DevConf.us 2019 is the 2nd annual, free, Red Hat-sponsored technology conference for community project and professional contributors to Free and Open Source technologies, held at Boston University in the historic city of Boston, USA.
When: Thursday, August 15 to Saturday, August 17, 2019
As IT operations become more agile and complex, the need to enhance operational efficiency and intelligence grows. Monitoring applications and Kubernetes clusters with Prometheus has become quite common. Yet identifying relevant metrics and thresholds for your setup is getting harder.
In this talk, Marcel will show the tooling used to collect and store metrics gathered by Prometheus for the long term. Then he will analyze those metrics at scale to extract trends and seasonality and to forecast expected values for a given metric. Finally, he will integrate the predicted metrics back into the Prometheus monitoring and alerting stack to enable dynamic thresholding and anomaly detection.
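The idea of dynamic thresholding described above can be sketched in a few lines. This is a minimal illustration, not the tooling presented in the talk: it assumes a metric sampled at a fixed interval with a known seasonal period, and the function names `seasonal_bounds` and `is_anomaly`, the `period` parameter, and the band width `k` are all hypothetical.

```python
from statistics import mean, stdev


def seasonal_bounds(history, period, k=3.0):
    """Compute a (lower, upper) band for each phase of a seasonal cycle.

    history: list of floats sampled at a fixed interval (assumed)
    period:  number of samples in one seasonal cycle (assumed known)
    k:       half-width of the band in standard deviations
    """
    bounds = []
    for phase in range(period):
        # All past samples that fall on the same position in the cycle.
        samples = history[phase::period]
        mu = mean(samples)
        sigma = stdev(samples) if len(samples) > 1 else 0.0
        bounds.append((mu - k * sigma, mu + k * sigma))
    return bounds


def is_anomaly(value, index, bounds):
    """Flag a new sample that falls outside the band for its phase."""
    lower, upper = bounds[index % len(bounds)]
    return value < lower or value > upper


# Three cycles of a metric with period 3.
history = [10, 20, 30, 11, 21, 31, 9, 19, 29]
bounds = seasonal_bounds(history, period=3)
print(is_anomaly(25, 9, bounds))  # sample 9 falls on phase 0 of the cycle
```

In a real deployment the band would come from a forecasting model rather than per-phase averages, and the bounds would be exposed as metrics so that alerting rules can compare the live value against them.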
Marcel Hild has 25+ years of experience in open source business and development. He co-founded a Linux consulting company, worked as a freelance developer, a Solution Architect for Red Hat, and core Developer for Cloudforms, a Hybrid Cloud Management tool. Now he researches the topic...
Thursday August 15, 2019 10:50 - 11:35 EDT
Terrace Lounge, GS Union, BU
The overwhelming majority of data collected by enterprises is unlabeled data. Understanding hidden structure in data is therefore a central task for machine learning. While the field is still not nearly as mature and structured as supervised learning ("predictive analytics"), there are multiple conceptual frameworks and techniques that can be used to attack such problems. This talk introduces some deep-learning based techniques that have shown promising performance.
Used by Facebook, Netflix, Twitter, Uber, Lyft, and many others, Presto has become a ubiquitous solution for running fast SQL analytics across disparate data sources. Presto is an open source distributed SQL query engine widely recognized for its low-latency queries, high concurrency, and native ability to query multiple data sources. These data sources may include Ceph, S3, Google Cloud Storage, Azure Storage, Hadoop/HDFS, relational database systems such as PostgreSQL, and non-relational systems such as Apache Kafka. Presto’s connector-based architecture allows you to query virtually anything. In the first part of the talk we will focus on what Presto is, its background, and its architecture. In the second part of the talk we will learn about Presto’s cloud native capabilities using Red Hat OpenShift and Kubernetes. Kubernetes reduces the burden and complexity of configuring, deploying, managing, and monitoring containerized applications. To achieve these capabilities with Presto, Red Hat and Starburst partnered to provide the Presto Kubernetes Operator and the Presto Container on OpenShift. The talk will also include a live demo of using Presto on Kubernetes and showing SQL federation of data between Ceph and other data sources. After the talk, you will have a good understanding of Presto basics and be ready to participate in the Presto open source community as well as try Presto on your own.
As data is exponentially growing in organizations, there is an increasing need to consolidate silos of information into a single source of truth, a 'Data Lake' to feed hungry Analytics and Machine Learning Engines that can gather insight at scale. In this talk, we will provide an overview of the Open Data Hub architecture. Then we will highlight a current data science use case leveraging Red Hat OpenShift, Ceph Storage, and analytics with Spark and SKLearn on JupyterHub.
Sherard Griffin is a Senior Manager at Red Hat. His primary responsibility is the architecture and development of an enterprise-grade AI-as-a-service platform on Kubernetes. Sherard is also responsible for the deployment of Red Hat’s internal AI-as-a-service platform where hundreds...
Anish is an engineering manager at Red Hat in the OpenShift AI organization. He is working on making machine learning easier for the wider community by building a platform out with cloud capabilities at the core. Most recently, his interests have been focused on the Distributed Workloads...
Data scientists and machine learning (ML) engineers rely heavily on creating workflows to train, verify, and deploy ML models. Argo is a cloud-native workflow management tool that enables the creation of sophisticated native Kubernetes ML workflows. Kubeflow is an ML toolkit that incorporates best-of-breed ML projects (for example, TensorFlow) as well as providing important infrastructure such as hyperparameter tuning. The Open Data Hub (ODH) is a scalable data lake platform that provides tools such as distributed Spark and the Ceph data store. In this presentation we will explore workflow features available using Argo running as a component of Kubeflow and integrated with ODH. The presentation will include a live demonstration of ML workflows using Kubeflow, Spark and Ceph.
Pete MacKinnon is a Principal Software Engineer in the AI Center of Excellence at Red Hat. He is actively involved in the Kubeflow and Open Data Hub open source projects. He works closely with Red Hat customers and partners to successfully bring their machine learning and analytics...
Senior Software Engineer at Red Hat AICOE team, Red Hat
Senior Software Engineer at Red Hat AICOE team with many years of advanced R&D experience in the telecommunications industry. Juana's focus is broad and includes AI/ML Platform, wireless communication networks, embedded/mobile devices, cloud services and data analysis.
Thursday August 15, 2019 14:40 - 15:25 EDT
Terrace Lounge, GS Union, BU
Machine learning is more complex than traditional software development. We all know that finding the right parameters when tuning a machine learning model is essential to arriving at the optimal model.
Have multiple machine learning models with different parameters running, but do not have a platform to track each of these models? MLFlow is the new open source platform which allows data scientists, product engineers, developers and large organizations to manage the ML lifecycle, including experimentation, reproducibility and deployment.
Attend this talk to learn how to: 1. Use MLFlow for experiment tracking 2. Perform hyperparameter model tuning with MLFlow 3. Run, compare and visualize the metrics/parameters of multiple runs of your machine learning model with MLFlow
Hema Veeradhi is a Senior Data Scientist working in the Emerging Technologies team, part of the Office of the CTO at Red Hat. Her work primarily focuses on implementing innovative open AI and machine learning solutions to help solve business and engineering problems. Hema is a staunch...
Currently focused on developing an analytics platform on OpenShift and leveraging Open Source ML Frameworks: Apache Spark, TensorFlow and more. Designing a high-performance, scalable ML platform that exposes metrics through cloud-native technology: Prometheus and Kubernetes.
Thursday August 15, 2019 15:50 - 16:35 EDT
Terrace Lounge, GS Union, BU
SEUSS is a custom operating system we spliced into the backend of a high throughput distributed Serverless platform, Apache OpenWhisk. SEUSS uses an alternative isolation mechanism to containers, Library Operating Systems (LibOSs). LibOSs enable a lightweight snapshotting technique. Snapshotting LibOSs enables two counterintuitive results: 1) although LibOSs inherently replicate system state, SEUSS can cache multiplicatively more functions on a node; 2) although LibOSs can suffer bad "first run" performance, SEUSS is able to reduce cold start times by orders of magnitude. By increasing sharing and decreasing deterministic bringup, SEUSS radically reduces the amount of hardware and cycles required to run a FaaS platform.
As log data continues to grow, is it possible to automate analyzing logs to find anomalies and errors quickly with machine learning? Log anomaly detection is at its core an unsupervised natural language processing (NLP) Machine Learning problem that can be difficult to validate and tune in a production setting. However, if there was a system that could incorporate minimal user feedback during model training, we could offset some of these challenges.
In this talk, you will learn lessons from building a log anomaly detection system in production, with specific emphasis on the technical challenges faced: scalability, implementing a human-in-the-loop ML system, and integrating ETL with unsupervised ML to detect anomalies in application logs.
Michael Clifford is a Data Scientist at Red Hat working in the Office of the CTO on Emerging Technologies, where he works primarily on exploring tools, methodologies and use cases for cloud native data science.
Currently focused on developing an analytics platform on OpenShift and leveraging Open Source ML Frameworks: Apache Spark, TensorFlow and more. Designing a high-performance, scalable ML platform that exposes metrics through cloud-native technology: Prometheus and Kubernetes.
Thursday August 15, 2019 17:30 - 18:15 EDT
Terrace Lounge, GS Union, BU
The Open Data Hub (ODH) is a scalable data lake platform for data scientists and their machine learning (ML) activities. We have an internal deployment of ODH that processes RHEL build logs and test results. It aggregates and replicates the data from internal and upstream CI systems. AIOps then uses ML techniques to analyze the data, specifically for anomaly detection. The central component is Apache Kafka, deployed across multiple data centers on a common OpenShift platform, which ensures high availability of the core of the ODH system. The presenters will discuss this use case where Kafka is deployed as a caching tool in a highly available and horizontally scalable architecture running on OpenShift, processing hundreds of gigabytes of log data per minute and thousands of messages per second.
Pete MacKinnon is a Principal Software Engineer in the AI Center of Excellence at Red Hat. He is actively involved in the Kubeflow and Open Data Hub open source projects. He works closely with Red Hat customers and partners to successfully bring their machine learning and analytics...
Software Engineer at the AI Center of Excellence at Red Hat, Red Hat Inc.
Hi, I am a Software Engineer with the AICoE at Red Hat. Before this I did my Masters in Computer Science at Boston University. At Red Hat I work as a Data Engineer, which involves ferrying massive amounts of data across systems, and I also work on monitoring a
Friday August 16, 2019 09:00 - 09:45 EDT
East Balcony, GS Union, BU
A Developer or Quality Engineer (QE) performs a multitude of software engineering tasks, such as searching lengthy documentation (for example, functional specs) to get condensed information, sifting through logs to triage bugs or find anomalies, analyzing bugs to determine whether they are duplicates, and analyzing test failure logs to identify false positives. In this talk, we will detail AI-Library, an open source machine learning framework built on OpenShift that contains well-known machine learning models and solutions to common software engineering use cases. We will show how developers or QE can leverage this framework to solve software engineering problems in an intelligent and automated manner, thereby increasing their productivity.
Senior Software Engineer (QE and Analysis), Red Hat, RH - Raleigh - Red Hat Tower
Dr. Prasanth Anbalagan is a Senior Software Engineer (QE and Analysis) on the Artificial Intelligence Center of Excellence Team at Red Hat. As a member of the AI team at Red Hat, Prasanth focuses on development of ML services and tools as part of an Analytics, Machine Learning and AI...
Friday August 16, 2019 09:50 - 10:35 EDT
East Balcony, GS Union, BU
Streaming platforms have emerged as a popular, new trend, but what exactly is a streaming platform? Part messaging system, part Hadoop made fast, part fast ETL and scalable data integration, with Apache Kafka at the core, streaming platforms offer an entirely new perspective on managing the flow of data. This talk will explain what a streaming platform such as Apache Kafka is and some of the use cases and design patterns around its use. Moreover, this talk will also present and answer a set of random -- but recurring -- questions from the community about Apache Kafka.
Ricardo is a Developer Advocate at Confluent, the company founded by the creators of Apache Kafka. He has over 21 years of experience working with software engineering, where he specialized in different types of distributed systems such as integration, SOA, NoSQL, messaging, API management...
Friday August 16, 2019 11:00 - 11:45 EDT
East Balcony, GS Union, BU
Elegant solutions often synthesize known facts with a small amount of new effort. To date, we've struggled to get computing machines involved in much elegant problem solving. Considering constraints like budget caps and the polar ice caps, this lack of elegance becomes more than an aesthetic issue.
In this talk I'll present ASC and SEUSS, two systems designed to reduce this new effort. ASC, a Harvard/BU collaboration, attempts to auto-parallelize single threaded workloads, reducing new effort required from programmers to achieve wall clock speedup.
SEUSS is a custom operating system we spliced into the backend of a high throughput distributed Serverless platform, Apache OpenWhisk. SEUSS uses an alternative isolation mechanism to containers, Library Operating Systems (LibOSs). LibOSs enable a lightweight snapshotting technique. Snapshotting LibOSs enables two counterintuitive results: 1) although LibOSs inherently replicate system state, SEUSS can cache multiplicatively more functions on a node; 2) although LibOSs can suffer bad "first run" performance, SEUSS is able to reduce cold start times by orders of magnitude. By increasing sharing and decreasing deterministic bringup, SEUSS radically reduces the amount of hardware and cycles required to run a FaaS platform.
In this talk we will learn how OpenShift can support data science and ML workflows for TensorFlow application developers. We will learn how to use OpenShift to be more productive while training and deploying TensorFlow models. We will look at various model formats, tools available for developers, and patterns useful for both data scientists and DevOps engineers.
ML systems are well-positioned to analyze natural language workloads and discover actionable insights in a DevOps environment. However, this presents a challenge when it comes to narrowing down to a system which can generalize to a wide variety of use cases. We discuss the limitations encountered in use of sentiment analysis across various artifacts in a DevOps environment. Also, we show how we improved the service through continuous learning and evolved the system to learn, adapt and produce desirable outcomes for a multitude of use cases. We share the lessons learned while going through the process and demonstrate usage of the framework with examples. Through the presentation, developers will quickly learn how to leverage and implement sentiment analysis into their existing environment.
Oindrilla is a Senior Data Scientist at Red Hat, in the Office of the CTO working on emerging trends and research in ML and AI. She works on evaluating new tools, platforms, and methodologies in the open source Data Science ecosystem, for enhancing Red Hat products and internal services...
Friday August 16, 2019 14:00 - 14:45 EDT
East Balcony, GS Union, BU
Have a great idea for a data science experiment but don't have the hardware to run it? The MOC and Red Hat have partnered to deploy the Open Data Hub into the MOC giving you access to hardware and support required for leading edge experiments.
The MOC IaaS platform combined with OpenShift and current data science development tools provides you with an alternative to using public clouds to execute your experiments.
Attend this talk to learn about: - What the Massachusetts Open Cloud and Open Cloud Exchange are - Current projects running in the MOC - Running your project in the MOC
Project ChRIS is an open source initiative of Boston Children's Hospital, Boston University, Mass Open Cloud and Red Hat to democratize access to and development of medical image processing software and leverage OpenShift-based containers to optimize performance. Today, processing a single set of images takes about ten hours. Our goal is to reduce image-processing time by enabling GPU hardware acceleration and comparing the performance between CPU and GPU based computing environments. We'll share some strategies for tuning the performance of workloads. Learn about Project ChRIS (https://red.ht/2D6XNx7) and browse the code (https://bit.ly/2FXMSGJ). No limit on attendees. Familiarity with OpenShift and TensorFlow is helpful but not required.
Parul Singh is a Senior Software Engineer in the emerging technologies group within the Red Hat Office of the CTO. She is responsible for researching emerging technology trends and developing cloud-native prototypes that address the identified challenges and opportunities and inform...
Engineering Manager, Senior Principal Software Engineer, Red Hat Continuous Productization
Senior Principal Software Engineer & Engineering Manager at Red Hat working on Continuous Productization technologies and a contributor to Project ChRIS. He's also a graduate of the BU BA/MA program in Computer Science and an avid maker, traditional woodworker, blogger and author...
Friday August 16, 2019 16:00 - 16:45 EDT
East Balcony, GS Union, BU
The human brain has many capabilities thanks to its network structure, which allows information to be transferred among neurons in order to perform one or more actions. With Machine Learning (ML) we focus on the learning capability, but the brain also uses its network to store information in memory. Nowadays many fields (e.g. medicine, biology, security, space) rely on graph structures to store data with semantics and context specific to that domain. Graph neural networks (GNNs) allow machines to learn from this kind of structure, taking a small step closer to mimicking the behavior and architecture of the human brain. In this talk, we will explore this type of neural network, which takes graphs as input, showing its capabilities, issues and applications.
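To make the idea of learning from graph structure concrete, here is a minimal sketch of one round of neighborhood aggregation (often called message passing), the basic operation that GNN layers build on. It is an illustration only and not taken from the talk: the function name `message_passing_step` and the `self_weight` parameter are hypothetical, and real GNN layers would add learned weight matrices and a nonlinearity on top of the averaging shown here.

```python
def message_passing_step(features, edges, self_weight=0.5):
    """One round of mean-aggregation message passing on an undirected graph.

    features:    dict mapping node -> list of floats (its feature vector)
    edges:       list of (u, v) pairs
    self_weight: how much of a node's own features to keep (assumed fixed
                 here; a trained GNN would learn this mixing)
    """
    # Build an adjacency list from the edge list.
    neighbors = {node: [] for node in features}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    updated = {}
    for node, own in features.items():
        nbrs = neighbors[node]
        if nbrs:
            # Average each feature dimension over the node's neighbors.
            agg = [
                sum(features[n][i] for n in nbrs) / len(nbrs)
                for i in range(len(own))
            ]
        else:
            # Isolated nodes keep their own features.
            agg = list(own)
        updated[node] = [
            self_weight * o + (1 - self_weight) * a for o, a in zip(own, agg)
        ]
    return updated


# A three-node path graph: a - b - c
features = {"a": [1.0], "b": [3.0], "c": [5.0]}
edges = [("a", "b"), ("b", "c")]
print(message_passing_step(features, edges))
```

Stacking several such rounds lets information from nodes further away reach each node, which is how a GNN builds representations that reflect the surrounding graph.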
Senior Data Scientist/Senior Software Engineer, Thoth Team, AICoE, Red Hat
Francesco has a passion for AI, Software and Space, all developed Open Source. He previously worked at the European Space Agency (ESA) on his PhD topic mixing AI and the space field. He recently joined AICoE at Red Hat and he is part of the Thoth team.
Friday August 16, 2019 16:50 - 17:35 EDT
East Balcony, GS Union, BU