May 2-4, 2018 - Copenhagen, Denmark
Tuesday, May 1 • 17:40 - 17:45
Lightning Talk: Scaling Distributed Deep Learning with Service Discovery: How CoreDNS Helps Distributed TensorFlow Tasks - Yong Tang, Infoblox Inc. (Intermediate Skill Level) (Slides Attached)

Training models with modern deep learning architectures is often computationally intensive and requires an efficient distributed system that operates at scale. In the distributed machine learning community, such systems often have special requirements and may involve additional effort.

This talk discusses the use of CoreDNS for service discovery in distributed TensorFlow clusters that solve deep learning problems.

While CoreDNS has been widely used for service discovery in Kubernetes, its unique plugin-based design also allows it to be easily extended and deployed in non-traditional distributed systems.

Deployed in the cloud (AWS), our distributed TensorFlow clusters rely on CoreDNS for robustness against partial node failures. CoreDNS has also simplified deployment, so that users without a DevOps background (e.g., machine learning researchers) can launch and run deep learning tasks with ease.
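
As a rough illustration of DNS-based service discovery for a TensorFlow cluster, the sketch below resolves one DNS name per job into a list of "ip:port" task addresses. It is not taken from the talk: the DNS names, the port, and the helper functions are hypothetical, and it assumes the DNS server (CoreDNS, for example) returns one A record per live node for each name. The resulting dictionary has the job-to-addresses shape that `tf.train.ClusterSpec` accepts.

```python
import socket
from typing import Callable, Dict, List


def resolve_hosts(name: str, port: int) -> List[str]:
    """Resolve a DNS name to a sorted list of unique 'ip:port' strings.

    Assumes the DNS server answers with one A record per live node.
    """
    infos = socket.getaddrinfo(
        name, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is an (ip, port) tuple.
    addresses = sorted({info[4][0] for info in infos})
    return [f"{ip}:{port}" for ip in addresses]


def build_cluster_spec(
    job_dns_names: Dict[str, str],
    port: int,
    resolver: Callable[[str, int], List[str]] = resolve_hosts,
) -> Dict[str, List[str]]:
    """Map each job name (e.g. 'worker', 'ps') to its resolved task addresses.

    The resolver is injectable so the lookup can be replaced in tests.
    """
    return {
        job: resolver(dns_name, port)
        for job, dns_name in job_dns_names.items()
    }
```

A caller might pass something like `{"worker": "worker.tf.example.internal", "ps": "ps.tf.example.internal"}` (hypothetical names) with port 2222, then hand the result to `tf.train.ClusterSpec`. Re-running the lookup after a node failure would return only the addresses still present in DNS.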

Speakers
Yong Tang

Principal Software Engineer, Infoblox Inc.
Yong Tang is a Principal Software Engineer in the CTO Office at Infoblox Inc. He works on CoreDNS at Infoblox for the open source community, with a focus on service discovery and Kubernetes integration. He also works on various machine learning projects at Infoblox. Yong Tang received...



Tuesday May 1, 2018 17:40 - 17:45
Auditorium 10-12