Episodes

  • TCP-Talks: Focus on Value: How FinOps Transformed from Cost Cops to Business Enablers
    Aug 6 2025
    For this special edition of TCP Talks, Justin Brodley is joined by four distinguished guests from the FinOps Foundation following the recent FinOps X conference in San Diego. Rob Martin, Mike Fuller, Graham Murphy, and the TCP team dive deep into the evolution of FinOps from pure cloud cost management to the broader “Cloud Plus” world, the rapid adoption of Focus 1.2, and how AI is transforming both what we manage and how we manage it.

    About Our Guests

    Rob Martin has been with the FinOps Foundation for four years, currently focusing on the AI working group, ITAM initiatives, and the rapidly growing public sector adoption. His experience spans training development and strategic initiatives that have helped shape the foundation’s direction during a period of explosive growth.

    Mike Fuller is one of the founding members of the FinOps Foundation and co-author of the Cloud FinOps book. As a member of the Focus project steering committee, he’s been instrumental in developing the specification that’s standardizing cloud billing data across the industry.

    Graham Murphy serves as Director of SaaS P&L for Technology One in Brisbane. With 8-9 years in FinOps and recently nominated as both a FinOps Ambassador and Focus Ambassador, Graham brings a practitioner’s perspective from the APAC region and insights on implementing Focus in a SaaS environment.

    Conference Growth and Evolution

    The 2025 FinOps X conference in San Diego marked a significant milestone with approximately 2,000 attendees, a substantial increase from the previous year. Despite the larger venue, the conference maintained its intimate feel, allowing for meaningful connections and knowledge sharing.

    2:49 Graham: “AI definitely grew a lot this year. A lot more talk about how you go about managing AI, how FinOps is going to drive better value out of your AI investments. And also just a lot of people trying to understand where to start.”

    The conference format evolved with more senior leadership participation, including executives from PepsiCo, Ticketmaster, and Nubank sharing their FinOps journeys. The quality of presentations notably improved, with practitioners willing to share deeper insights into their mature FinOps programs.

    The Cloud Plus Revolution

    A dominant theme throughout the conference was the expansion beyond traditional cloud cost management into what the foundation calls “Cloud Plus,” which encompasses SaaS, data center, licensing, and AI costs.

    4:31 Mike: “We saw that sort of echoed quite well across many of the breakout sessions by practitioners exactly how they’re sort of incorporating other costs into the conversation of their practices.”

    6:56 Rob: “Ticketmaster said something that I loved, which was that they were ‘happily hybrid’… we understand that we’ve got all these different modalities that we’re going to use to deliver value: SaaS models and data center models and cloud models.”

    This shift represents a fundamental change in how organizations view FinOps, moving from a cloud-specific practice to a comprehensive IT financial management approach.

    Focus 1.2: The Game Changer

    The release of Focus 1.2 at the conference marked a pivotal moment for billing data standardization. The specification now includes support for SaaS costs and virtual currencies like tokens and credits, which are critical for AI and modern consumption-based pricing models.

    16:23 Mike: “The big tagline for Focus 1.2 was sort of keeping up with how FinOps is approaching that Cloud Plus world with the introduction of columns that really help bring in SaaS costs.”

    Focus Adoption Momentum

    The adoption curve for Focus has been remarkable:

    19:53 Rob: “Last year, people were saying ‘Tell me what is Focus?’ In Barcelona, they asked ‘How do I actually start using it?’ This year, almost every conversation was ‘I’m trying to get my data into Focus, and this is how we’re doing it.’”

    Major milestones include:

    • AWS and Microsoft announcing Focus 1.2 support
    • Grafana, Databricks, and Snowflake joining as data generators
    • European Union standardizing on Focus for inter-agency billing
    • UK, Japanese, Canadian, and Brazilian governments adopting Focus as their standard

    AI: From Experimentation to Operation

    The transformation in AI cost management over the past year has been dramatic. Organizations have moved from wondering if they should worry about AI costs to actively managing significant AI spending.

    13:31 Rob: “We’ve got private connections to some customers where we’re serializing 400 gigabits per second line rate to them, so that they can very, very rapidly move libraries of data to tightly schedule back to back with perhaps maybe a GPU farm instance.”

    The Dual AI Challenge

    Organizations face two distinct AI-related challenges:

    • FinOps for AI: Managing the costs of AI workloads, including training, inference, and data movement
    • AI for FinOps: Using AI to enhance FinOps practices through automation and intelligent recommendations

    ...
    48 minutes
  • TCP Talks: The David vs. Goliath of Cloud Storage: Chris Opat from Backblaze on Challenging Hyperscalers
    Jul 22 2025
    For this special edition of TCP Talks, Justin Brodley and Matthew Kohn are joined by Chris Opat, SVP of Cloud Operations at Backblaze, to discuss how the cloud storage innovator is reshaping the industry landscape. From their origins as a consumer backup company to becoming a major player in enterprise cloud storage, Chris shares insights on AI workloads, the true cost of egress fees, and why your data doesn’t have to live in a walled garden.

    About Backblaze

    Backblaze started in 2007 with a simple mission: make storage so affordable it’s almost free. The company gained early notoriety for their DIY approach to storage infrastructure, with founders literally bending metal in apartments and conducting “guerrilla storage purchasing” raids at Bay Area Best Buys and Fry’s Electronics to build their custom red storage pods. This scrappy, cost-conscious DNA remains central to the company’s identity today.

    In September 2015, Backblaze made their enterprise pivot with the launch of B2 Cloud Storage, entering the market at one-quarter the cost of Amazon S3. By December of that launch year, they had already attracted over 30,000 users. Today, Backblaze (NASDAQ: BLZE) manages approximately 4.7 exabytes of data across 310,000+ drives, serving over 500,000 customers in 175 countries.

    What sets Backblaze apart isn’t just their pricing; it’s their philosophy. While hyperscalers have built complex storage tiers with Byzantine billing structures, Backblaze offers one tier of hot storage with transparent, predictable pricing. Their recent push into AI workloads with B2 Overdrive demonstrates their ability to evolve with market demands while maintaining their core value proposition.

    About Chris Opat

    Chris Opat joined Backblaze as SVP of Cloud Operations in 2023, bringing over 25 years of experience in building teams and technology at startup and scale-up companies. Before Backblaze, he served as SVP of Platform Engineering and Operations at StackPath, specializing in edge technology and content delivery. His background includes extensive work with private equity portfolio companies, where he honed his skills in rapid transformation and growth.

    Chris describes himself as someone who thrives in “David vs. Goliath” scenarios, making Backblaze, with its mission to challenge the hyperscaler incumbents, a perfect fit. His passion for building exceptional technical teams and pushing technological boundaries aligns perfectly with Backblaze’s innovative culture.

    Interview Highlights

    The David vs. Goliath Mentality

    3:15 Chris: “Nothing makes me happier than to watch a customer choose us over the incumbent competitors and have an exceptionally good experience. It’s easy to work for the incumbents and kind of win all the time. It feels so much better when you do it as the upstart that people don’t see coming.”

    Chris emphasized how Backblaze offers a fundamentally different partner experience compared to hyperscalers. While AWS, Azure, and Google Cloud may provide excellent services, they often lack the personal touch and flexibility that smaller customers need. At Backblaze, customers can directly influence product strategy and speak with decision-makers who shape the company’s direction.

    Egress Fees: The Hidden Tax of Cloud Storage

    7:59 Chris: “Everybody who uses a hyperscaler is very familiar with the taxation of egress fees. It’s not a trivial subject… If you don’t know what you’re doing with a hyperscaler, egress fees can quickly sour your experience. They can drain your budget.”

    The discussion on egress fees revealed one of Backblaze’s key differentiators: their no egress fee policy through their Bandwidth Alliance partnerships. Chris shared a compelling example of a customer who saved hundreds of thousands of dollars on egress fees in their first year with Backblaze. This transparent pricing model contrasts sharply with hyperscalers, where egress costs can spiral out of control.

    When asked about recent announcements from Google and Amazon regarding “free” egress, Chris didn’t mince words:

    10:07 Chris: “The devil’s in the details… The only way that they honor the free egress for repatriating your data is if you cancel all the services, and the cancellation timeframe… it’s something pretty brisk. It’s like 90 days or something.”

    AI Workloads: The New Frontier

    The conversation revealed how dramatically Backblaze’s customer base has evolved, particularly with AI workloads:

    13:31 Chris: “We’ve got private connections to some customers where we’re serializing 400 gigabits per second line rate to them, so that they can very, very rapidly move libraries of data to tightly schedule back to back with perhaps maybe a GPU farm instance that they’ve got booked.”

    This represents a massive shift from their traditional backup use cases. The new B2 Overdrive product specifically addresses these high-bandwidth needs, offering performance levels that Chris claims most ...
    37 minutes
  • TCP Talks: The evolution of FinOps & Why you should attend FinOps-X
    May 26 2024
    Summary – FinOps X

    In this conversation, Joe Daly and Rob Martin from the FinOps Foundation discuss the latest developments in the FinOps space and FinOps-X. They talk about the evolution of FinOps practices, the growth of the FinOps community, and the importance of the Focus project, which aims to standardize billing data from different cloud providers. They also discuss the adoption of FinOps practices by SaaS companies and the future of the FinOps space. The conversation covers the updates and changes in the FinOps framework, including the addition of allied personas and the simplification of domains and capabilities. It also discusses the upcoming FinOps-X conference and the value it provides for attendees, including deep and concrete content, networking opportunities, and career advancement.

    Keywords

    FinOps, FinOps Foundation, FinOps X conference, podcast, cloud providers, Focus project, billing data, cloud-agnostic, tool agnostic, open source project, SaaS companies, FinOps framework, allied personas, domains and capabilities, FinOps-X conference, deep content, networking, career advancement, FinOps-X Europe

    Takeaways

    • FinOps practices have evolved to focus on making processes more operational and improving decision-making in businesses.
    • The FinOps Foundation has seen significant growth, with over 100 members, including major cloud providers.
    • The Focus project, an open billing standard, aims to consolidate billing data from different cloud providers and enable more effective cost allocation.
    • The adoption of FinOps practices by SaaS companies is increasing, with a focus on consumption-based licensing management.
    • The future of the FinOps space includes expanding the Focus project to include sustainability data and additional usage-based data.
    • The FinOps framework has been updated to include allied personas and simplified domains and capabilities.
    • FinOps-X conference provides valuable content, networking opportunities, and career advancement for attendees.
    • FinOps-X Europe conference in Barcelona offers a focused event for the European market.
    • The conversation also mentions the importance of small businesses attending the conference and the success stories of attendees.

    Sound Bites

    • “How do I make these processes much more operational? How do I affect the broader decision-making going on in my business?”
    • “The Focus project… will consolidate or specify how billing data should come from the different cloud providers.”
    • “The Focus project… essentially handles the data ingestion problem that has plagued a lot of organizations early on.”
    • “The two big changes that happened this year were the addition of a lot of allied personas.”
    • “We’ve simplified those down into four key domains.”
    • “What other things are you guys excited about for FinOps-X?”

    About Joe Daly & Rob Martin

    Joe Daly is a Director of Community for the FinOps Foundation, which is kind of like sitting at the largest lunch table in Middle School, but with less vaping. He’s had illustrious careers as a CPA (the Statute of Limitations has passed for all tax returns he prepared and he has let his CPA expire), Corporate Taxation, IT Finance & Accounting, IT Portfolio Management, a regrettable stint as Manager of Server Operations, and has started two teams that perform what has come to be known as FinOps. He lives in Columbus, OH and enjoys copying off Rob. Go Captains!

    Rob Martin is a FinOps Principal at the FinOps Foundation, which is kind of like being a Middle School Principal, but with less vaping. He’s had illustrious careers at Accenture, the US Department of Justice, Amazon Web Services, and Cloudability, and less lustrious jobs at a few other places. He now spends his time collecting, developing, and distributing FinOps content among the huge global community of people who deliver value from cloud. He lives in Leesburg, VA, and enjoys games (including the FinOps Boardgame!), hiking, and announcing for his son’s high school soccer team. Go Captains!

    Chapters

    • 00:00 Introduction and Overview
    • 02:32 The Evolution of FinOps Practices
    • 05:19 The Growth of the FinOps Community
    • 06:18 The Importance of the Focus Project
    • 09:29 Adoption of FinOps Practices by SaaS Companies
    • 12:35 The Future of the FinOps Space
    • 24:29 The Value of FinOps-X Conference
    • 28:29 FinOps-X Europe: A Focused Event for the European Market
    • 29:32 Success Stories and Career Advancement at FinOps-X

    Learn More:

    • FinOps Foundation
    • FinOps Foundation on Twitter
    • FinOps-X
    • Subscribe to The Cloud Pod
    37 minutes
  • TCP Talks with Rackspace CTO of Public Cloud – Travis Runty
    May 7 2024

    For this special edition of TCP Talks, Justin and Jonathan are joined by Travis Runty, CTO of Public Cloud with Rackspace Technology. In today’s interview, they discuss being accidentally multi-cloud, public vs. private cloud, cloud migration, and best practices when assisting clients with their cloud journeys.

    Background

    Rackspace Technology, commonly known as Rackspace, is a leading multi-cloud solutions provider headquartered in San Antonio, Texas, United States. Founded in 1998, Rackspace has established itself as a trusted partner for businesses seeking expertise in managing and optimizing their cloud environments.

    The company offers a wide range of services aimed at helping organizations navigate the complexities of cloud computing, including cloud migration, managed hosting, security, data analytics, and application modernization. Rackspace supports various cloud platforms, including AWS, Azure, and GCP, among others.

    Rackspace prides itself on its “Fanatical Experience” approach, which emphasizes delivering exceptional customer support and service. This commitment to customer satisfaction has contributed to Rackspace’s reputation as a reliable and customer-centric provider in the cloud computing industry.

    Meet Travis Runty, CTO of Public Cloud for Rackspace Technology

    Beginning his career with Rackspace as a Linux engineer, Travis has spent the last 15 years working his way through multiple divisions of the company, including 10 years in senior and director level positions. Most recently, Travis served as VP of Technical Support of Global Cloud Operations from 2020-2022.

    Travis is extremely passionate about building and leading high performance engineering teams and delivering innovative solutions. Most recently, as a member of their technology council, Travis wrote an article for Forbes – Building a Cloud-Savvy Workforce: Empowering Your Team for Success – where he discussed best practices for prioritizing workforce enablement, especially when it comes to training and transformation initiatives.

    Interview Notes:

    In the main show, TCP has been talking a lot about cloud, hybrid cloud, multi-cloud, and repatriating data back on-prem, and today’s guest knows all about those topics.

    Rackspace has had quite a few phases in their journey to public cloud, including building a data center in an unused mall, introducing managed services, creating partnerships with VMware, attempting to go head to head with the hyperscalers, and then ultimately focusing on public cloud by partnering with the hyperscalers instead.

    Rackspace focuses on both private and public cloud. For private cloud they work mainly with VMware and OpenStack, whereas on the public cloud side Rackspace partners with the hyperscalers to assist clients with their cloud journey.

    Quotes from today’s show

    Travis: “We want to make sure that when a customer goes on their public cloud journey, that they actually have a robust strategy that is going to be effective. From there, we’re able to leverage our professional services teams to make sure that they can realize that transformation, and hopefully there *is* a transformation, and it’s not just a lift and shift.”

    Travis: “A conflict that we continuously have to strike the balance of is when do we apply a cloud native solution, and where do we apply the Rackspace elements on top. The hyperscalers’ technology is the best there is, and we’re probably not going to create a better version of ‘x’ than AWS does – nor do we want to.”

    Travis: “We favor cloud native. Every single time we’re going to favor the platform’s native solution, unless the customer has a really really strong opinion about being vendor locked. Which sometimes they do. And if that’s the case we can establish a solution that gives them that portability. But for right now, the customers are generally preferring cloud native solutions.”

    40 minutes
  • TCP Talks: Sandy Bird, Sonrai Security
    Apr 11 2024

    A bonus episode of The Cloud Pod may be just what the doctor ordered, and this week Justin and Jonathan are here to bring you an interview with Sandy Bird of Sonrai Security. There’s so much going on in the IAM space, and we’re really happy to have an expert in the studio with us this week to talk about some of the security least privilege specifics.

    Background

    Sonrai (pronounced Son-ree, which means data in Gaelic) was founded in 2017. Sonrai provides Cloud Data Control, and seeks to deliver a complete risk model of all identity and data relationships, which includes activity and movement across cloud accounts, providers, and third party data stores.

    Meet Sandy Bird, Co-founder of Sonrai Security

    Sandy is the co-founder and CTO of Sonrai, and has a long career in the tech industry. He was the CTO and co-founder of Q1 Labs, which was acquired by IBM in 2011, and helped to drive IBM security growth as CTO for global business security there.

    Interview Notes:

    One of the big questions we start the interview with is just how IAM has evolved, and what kind of effect those changes have had on the identity models. Enterprises want things to be least privilege, but it’s hard to find the logs. In the cloud, however, *most* things are logged, and so least privilege became an option.

    Sonrai offers the first cloud permissions firewall, which enables one-click least privilege management. That matters in the current environment, where the platforms operate so differently from each other. With this solution, you can better control your cloud access, limit your permissions and attack surface, and automate least privilege, all without slowing down DevOps.

    Is the perfect policy achievable? Sandy splits the question between human identities and workload identities, which are definitely separate. He claims that for workload identities the perfect policy is probably possible. Human identity usage is hugely sporadic; however, it’s important to at least try to get to that perfect policy, especially when dealing with sensitive information. One of the more interesting findings was that less than 10% of identities with sensitive permissions actually used them, and you can use that information to balance handing out standing permissions against granting them for a one-time use case.
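
    The usage finding above can be illustrated with a small sketch. This is not Sonrai’s product or API; the `Identity` record, the `SENSITIVE` list, and both helper functions are hypothetical names, assuming you already have, per identity, the set of granted permissions and the set of permissions seen in activity logs.

```python
from dataclasses import dataclass, field


@dataclass
class Identity:
    """Hypothetical record of one cloud identity (human or workload)."""
    name: str
    granted: set[str]                            # permissions attached to the identity
    used: set[str] = field(default_factory=set)  # permissions seen in activity logs


# Assumed examples of "sensitive" permissions; a real list would be much longer.
SENSITIVE = {"iam:CreateAccessKey", "iam:PassRole", "s3:DeleteBucket"}


def unused_sensitive(identity: Identity) -> set[str]:
    """Sensitive permissions that were granted but never exercised."""
    return (identity.granted & SENSITIVE) - identity.used


def share_using_sensitive(identities: list[Identity]) -> float:
    """Fraction of identities holding sensitive permissions that actually used one."""
    holders = [i for i in identities if i.granted & SENSITIVE]
    if not holders:
        return 0.0
    active = [i for i in holders if i.used & SENSITIVE]
    return len(active) / len(holders)
```

    Anything returned by `unused_sensitive` is a candidate for removal or for on-demand granting, and `share_using_sensitive` is the kind of ratio behind the “less than 10%” figure, computed here on whatever sample you feed it.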

    Sonrai spent a lot of time looking at new solutions to problems with permissions; part of this includes purpose-built integration, offering a flexible open GraphQL API with prebuilt integrations.

    Sonrai also offers continuous monitoring, providing ongoing intelligence on all permission usage (including excess permissions) and enabling the removal of unused permissions without disruption. Policy automation writes IAM policies tailored to access needs and simplifies processes for teams.

    On-demand access is another tool: it provides a quick and efficient process for requesting permissions that are restricted.

    Quotes from today’s show

    Sandy: “The unbelievably powerful model in AWS can do amazing things, especially when you get into some of the advanced conditions – but man, for a human to understand what all this stuff is, is super hard. Then you go to the Azure model, which is very different. It’s an allow first model. If you have an allow anywhere in the tree, you can do whatever is asked, but there’s this hierarchy to the whole thing, and so when you think you want to remove something you may not even be removing it, because something above may have that permission anyway. It’s a whole different model to learn there.”

    Sandy: “Only like 8% of those identities actually use the sensitive parts of them; the other 92 just sit in the cloud, never being used, and so most likely during that break-glass scenario in the middle of the night, somebody’s troubleshooting, they have to create some stuff, and overpermission it. If we control this centrally, the sprawl doesn’t happen.”

    Sandy: “There is this fear that if I remove this identity, I may not be able to put it back the way it was if it was supposed to be important… We came up with a secondary concept for the things that you were worried about… where we basically short circuit them, and say these things can’t log in and be used anymore, however we don’t delete the key material, we don’t delete the permissions. We leave those all intact.”

    40 minutes
  • TCP-Talks: Security & Observability with Datadog’s Andrew Krug
    Apr 12 2023
    Andrew Krug from Datadog

    In this episode, Andrew Krug talks about Datadog as a security observability tool, shedding light on some of its applications as well as its benefits to engineers.

    Andrew is the lead in Datadog Security Advocacy and Datadog Security Labs. Also a Cloud Security consultant, he started the Threat Response Project, a toolkit for Amazon Web Services first responders. Andrew has also spoken at Black Hat USA, DEFCON, re:Invent, and other platforms.

    Datadog Product Overview

    Datadog is focused on bringing security to engineering teams, not just security people. One of the biggest advantages of Datadog or other vendors is how they ingest and normalize various log sources. It can be very challenging to maintain a reasonable data structure for logs ingested from cloud providers.

    Vendors try to provide customers with enough signals that they feel they are getting value while trying not to flood them with unactionable alerts. Also, considering the cloud friendliness of the stack is crucial for clients evaluating a new product.

    Datadog is active in the open-source community and gives back to groups like the Cloud Native Computing Foundation. One of its popular open-source security tools is Stratus Red Team, which simulates the techniques of attackers in a clean-room environment. The criticality of findings is becoming a major topic: criticality should be evaluated based on how much risk a finding poses to the business and what can be done about it.

    One thing that even high-maturity DevOps teams struggle with is automating incident handling or the response to critical alerts. Automated changes can cause configuration drift, which is why there is a lot of hesitation to fully automate things. Having someone to make hard choices is at the heart of incident handling processes.

    Datadog Cloud SIEM was created to help customers who were already using Datadog for logs. It is also easy to use: the UI is simple enough that you do not need to be a security expert. It is quite difficult to deploy a SIEM on completely unstructured logs, so being able to extract and normalize data to a set of security attributes is highly beneficial. Interestingly, the typical boring hygiene issues that are easy to detect still cause major problems for very large companies. This is where posture management comes in, addressing issues in time to prevent large breaches.

    Generally, Datadog is inclined towards moving detections closer to the data they are securing, and examining the application runtime in real time to verify that there are no issues. Datadog can also help with IAM challenges through CSPM, which evaluates policies. For engineering teams, the benefit is that information surfaces in the places where they normally look, especially in Datadog Security products, where issues are sorted in order of importance.
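
    As a rough illustration of what “extract and normalize data to a set of security attributes” can mean, here is a minimal sketch. It does not use any Datadog API; the two input formats, the `JSON_KEYS` mapping, and the four target attributes are all assumptions made up for the example.

```python
import json
import re
from typing import Optional

# Hypothetical mapping from one source's JSON field names to common attributes.
JSON_KEYS = {"timestamp": "time", "user": "principal", "action": "event", "source_ip": "ip"}

# Assumed shape of an unstructured line, e.g.
# "2023-04-12T10:00:00Z login failed for alice from 203.0.113.7"
PLAIN = re.compile(
    r"^(?P<timestamp>\S+) (?P<action>.+?) for (?P<user>\S+) from (?P<source_ip>\S+)$"
)


def normalize(line: str) -> Optional[dict]:
    """Map one raw log line onto the common attribute set, or None if unparseable."""
    line = line.strip()
    if line.startswith("{"):
        try:
            raw = json.loads(line)
        except json.JSONDecodeError:
            return None
        return {field: raw.get(key) for field, key in JSON_KEYS.items()}
    match = PLAIN.match(line)
    return match.groupdict() if match else None
```

    Once every source lands in the same four attributes, a single detection rule (for example, many failed logins from one `source_ip`) can run across all of them, which is the practical payoff of normalization.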

    Security Observability Day is coming up on the 18th of April when Datadog products will be highlighted; the link to sign up is available on the Datadog Twitter page and Datadog community Slack. To find out more, reach out to Andrew on Twitter @andrewkrug and on the Datadog Security Labs website.

    Top Quotes

    • “I think that great security solutions…start with alerts that you are hundred percent confident as a customer that you would act on”
    • “When we talk about the context of ‘how critical is an alert?’ It is always nice to put that risk lens on it for the business”
    • “Humans are awesome unless you want really consistent results, and that’s where automating part of the process comes into play”
    • “More standardization always lends itself to better detection”
    28 minutes
  • TCP-Talks: Evolution of NoSQL with Couchbase CTO, Ravi Mayuram
    Mar 24 2023

    In this episode, Ravi Mayuram highlights the functionality of Couchbase as an evolutionary database platform, citing several simple day-to-day use cases and particular advantages of Couchbase.

    Ravi Mayuram is CTO of Couchbase. He is an accomplished engineering executive with a passion for creating and delivering game-changing products for startups as well as Fortune-500 industry-leading companies.

    Notes

    Couchbase set out to build a next-generation database. Data has evolved greatly with IT advancements. The goal was to build a database that will connect people to the newer technologies, addressing problems that relational systems did not have to solve. The fundamental shift is that earlier systems were internally focused, built for trained users but now the systems are built directly for consumers. This shift also plays out in the vast difference in the number of consumers now interacting with these systems compared to the fewer trained users previously interacting with the systems.

    One of the key factors that sets Couchbase apart is its NoSQL database, which has evolved by combining five systems: a cache and key-value store, a document store, a relational document store, a search system, and an analytical system. Secondly, Couchbase performs well in a geo-distributed manner: with one click, data is made available across availability zones. Lastly, all of this can be done at a large scale in seconds.

    Regarding the global database concept that Google talks about, most companies may not need a globally consistent database. Performance would be the biggest problem, as transactions would be considerably slower. Couchbase performs these transactions locally within the data center and replicates them on the other side. The main issue with relational systems is that they make you pay the price of a transaction every time, no matter how minor the operation; with Couchbase, it is possible to pay that cost only for certain crucial transactions.

    Edge has become part of the enterprise architecture, to the point that people now have edge-based solutions. Two edges are emerging: the network edge and the tool edge, where people are interfacing. Couchbase has built a mobile database that is available on devices and has sync capability.

    For the consumer, the primary advantage of bringing data closer is lower latency. Data often has to go through firewalls and multiple steps, which delays it, and avoiding that delay is the benefit of Couchbase. The user simply continues to have access to the data while Couchbase synchronizes it in the background.

    One application of Couchbase in healthcare is insulin tracking. Devices that monitor insulin must work everywhere you go, so Couchbase Lite does the insulin tracking, keeps the data even in the absence of a network, and later syncs it for review by healthcare professionals. This is also useful in operating rooms where the network is not accessible. The real benefit is seen when the data eventually gets back to the server and can be interpreted to make decisions on patient care.

    The Couchbase Capella service runs in the cloud and allows clients to specify what data should be sent to the edge and what should not be. This offers privacy and security, such that even if a device is lost or damaged, the data is secure and can be recovered. Managing edge on devices effectively means solving a lot of problems so that it becomes easier.

    One of the concerns for anyone coming to Couchbase Capella is the expense of extracting data from the cloud; however, Couchbase is available on all three cloud providers. Also, with Couchbase there is no need to keep replicating data, as you can work on the data without moving it, which largely saves costs.
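
    The store-locally-then-sync pattern described above can be sketched in a few lines. This is a toy model, not the Couchbase Lite API: `OfflineFirstStore`, its fields, and the plain list standing in for the server are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class OfflineFirstStore:
    """Toy offline-first store: writes always succeed locally and sync later."""
    local: list[dict] = field(default_factory=list)    # everything recorded on the device
    pending: list[dict] = field(default_factory=list)  # readings not yet sent to the server

    def record(self, reading: dict) -> None:
        # Writing never depends on the network.
        self.local.append(reading)
        self.pending.append(reading)

    def sync(self, server: list[dict], online: bool) -> int:
        """Push pending readings when a connection exists; return how many were sent."""
        if not online:
            return 0
        sent = len(self.pending)
        server.extend(self.pending)
        self.pending.clear()
        return sent
```

    A device would call `record` for every reading regardless of connectivity and call `sync` whenever the network comes back; a real sync engine also has to handle conflicts, retries, and partial failures, which this sketch leaves out.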

    Other use cases for Couchbase include information for flight bookings, flight crew management systems, hotel reservations, and credit card payments. To learn more, visit the Couchbase website. There is also a free trial for the Couchbase Capella Service.

    Top Quotes

    • “The modern database has to do more than what the old database did”
    • “Managing edge in devices is not an easy thing, and so you have to solve a lot of problems so it becomes easier”
    38 minutes
  • TCP-Talks: Revolutionizing Observability with New Relic featuring Daniel Kim
    Mar 2 2023
    Revolutionizing Observability with New Relic In this episode, Daniel explains a new strategy towards observability aimed at contextualizing large volumes of data to make it easier for users to identify the root cause of problems with their systems. Daniel Kim is a Principal Developer Relations Engineer at New Relic and the founder of Bit Project, a 501(c)(3) nonprofit dedicated to making tech accessible to under-served communities. His job is basically to get developers excited about Observability, and he hopes to inspire students to maximize their potential in tech through inclusive, accessible developer education. He is passionate about diversity and inclusion in tech, good food, and dad jokes. Show Notes First, it is important to differentiate between monitoring and observability. Monitoring is basically when a code is instrumented to send data to a backend, to give answers to preconceived questions. With Observability, the goal is to monitor your system so as to later ask questions that were not in mind during the instrumentation of the system. Hence, if something new comes up you can find the root cause without modifying the code. There are so many levels of things to check when troubleshooting to find the cause of a problem, and this is where observability comes in. There are different use cases for logs, metrics, and traces; Logs are files that record events, warnings, or errors however logs are ephemeral which means there is increased risk of losing a lot of data. A system needs to be in place to move logs to a central source. Another issue with logs is that it is poorly structured data. Logs are good to have as the last step of observability. Metrics and traces can however help to narrow down where to search in the logs to solve an issue. Metrics are measurements that reflect the performance or health of your applications. 
They give an overview of how systems are doing but tend not to be specific enough to pinpoint the root cause of a problem; other forms of data are needed to get a clear picture. This is where traces come in. Traces are pieces of data that track a request as it goes through the system. Because of this, they can identify the root cause of an error or the bottlenecks slowing down the system. However, traces are expensive to collect, so sampling is used, which reduces their accuracy. Correlating information from logs, metrics, and traces gives the full picture needed to debug successfully. Many New Relic customers want to collect more data in order to find errors faster. To balance the right data at the right time with the right cost, the first step when collecting large amounts of data is to find out how your organization is using that data. A quick audit to identify which data is useful helps, and it can be done monthly or quarterly. Unstructured logs are difficult to aggregate. In the cloud native space, compatibility with as many other projects as possible will determine the winners, because people run many different projects in production. APM is still very useful for understanding application performance, and in the future, data from all sources will be correlated to figure out the cause of a problem. Getting value early from the system involves having a solid infrastructure and installing APM. The real power of full-stack observability is getting data from different parts of your stack so you can diagnose which part of your system is going wrong. Leveraging AI to make sense of large amounts of data for engineers is going to be a huge plus. Many vendors claim that their alert systems will automatically generate all alerts for you, but this is not true because they cannot know your team’s needs. 
It is ultimately up to your team to set up alerts as part of its observability strategy. Those who invest time in setting this up get the most ROI from New Relic. Engineers need to figure out which metrics are important to them. About New Relic One: it was built as a single observability platform where people can correlate various pieces of data to get more context, making engineers’ work easier. The goal was to help engineers find the information they need as fast as possible, especially during a crisis. This kind of third-party solution is better suited than native tools to processing millions of logs or larger data volumes. It also provides a large amount of expertise around observability and curated experiences around machine-generated data. Customers seem to be tilting towards open-source observability solutions. OpenTelemetry is one example, as it brings open-source observability offerings together into a full-stack observability experience. Visit the New Relic website to learn more about it. To learn more about ways to use New Relic, check out the New Relic Blogs. Top ...
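The notes above mention that traces are expensive, so they are usually sampled at the cost of some accuracy. As a rough illustration only, here is a minimal Python sketch of one common approach, head-based sampling, where the keep/drop decision is made once per trace from its ID. The function name `keep_trace`, the `trace-N` ID format, and the 10% rate are all hypothetical choices for this example; this is not how New Relic or OpenTelemetry implement sampling.

```python
import hashlib


def keep_trace(trace_id: str, sample_rate: float) -> bool:
    """Decide once per trace, from its ID alone, whether to keep it.

    Hashing the ID makes the decision deterministic, so every service
    that sees the same trace ID reaches the same keep/drop decision.
    """
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < sample_rate


# Hypothetical trace IDs; a real system would use IDs from its tracer.
trace_ids = [f"trace-{i}" for i in range(10_000)]
kept = [t for t in trace_ids if keep_trace(t, 0.10)]

# Roughly 10% of traces survive; the other 90% can no longer be
# inspected, which is the accuracy trade-off described in the notes.
print(f"kept {len(kept)} of {len(trace_ids)} traces")
```

Lowering the rate cuts storage and ingest cost proportionally, but a rare error that happens to land in a dropped trace will not be visible, which is one reason metrics and logs are still needed alongside traces.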
    26 minutes