README
This repository contains our customized version of TensorFlow, which implements the dynamic scaling mechanism described in the AntMan paper (OSDI '20).
TensorFlow is modified mainly in three components: the memory allocator, the executor, and the interfaces. To enable dynamic universal memory, BFCAllocator is modified to introduce an adjustable upper limit on memory. The allocator tracks the total number of bytes allocated and triggers an out-of-memory error when that total exceeds the upper limit. A new universal memory allocator, GPUVMemAllocator, wraps the GPU memory allocator and the host memory allocator (i.e., host memory allocated through CUDA, such as with cudaMallocHost). When a tensor requests memory, GPUVMemAllocator first tries the GPU memory allocator and falls back to the host memory allocator if there is insufficient GPU memory left. GPUVMemAllocator also maintains a set that records the pointers of memory regions allocated on the GPU, which it uses to classify pointers at deallocation time.
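The allocation path described above can be illustrated with a minimal Python sketch. It is a toy model, not the C++ code in this repository: the class names `LimitedAllocator` and `UniversalAllocator`, the opaque handles standing in for pointers, and the byte counts are all hypothetical and chosen only to show the adjustable limit, the GPU-first fallback, and the set used to classify handles on deallocation.

```python
class LimitedAllocator:
    """Toy allocator with an adjustable upper limit (hypothetical stand-in)."""

    def __init__(self, name, limit_bytes):
        self.name = name
        self.limit_bytes = limit_bytes
        self.allocated_bytes = 0
        self._sizes = {}        # handle -> size in bytes
        self._next_id = 0

    def set_limit(self, limit_bytes):
        # The limit can be raised or lowered while the job is running.
        self.limit_bytes = limit_bytes

    def allocate(self, num_bytes):
        # Report out-of-memory (None) when the request would exceed the limit.
        if self.allocated_bytes + num_bytes > self.limit_bytes:
            return None
        handle = (self.name, self._next_id)
        self._next_id += 1
        self._sizes[handle] = num_bytes
        self.allocated_bytes += num_bytes
        return handle

    def deallocate(self, handle):
        self.allocated_bytes -= self._sizes.pop(handle)


class UniversalAllocator:
    """Tries the GPU allocator first and falls back to the host allocator."""

    def __init__(self, gpu, host):
        self.gpu = gpu
        self.host = host
        self.gpu_handles = set()   # handles that came from the GPU allocator

    def allocate(self, num_bytes):
        handle = self.gpu.allocate(num_bytes)
        if handle is not None:
            self.gpu_handles.add(handle)
            return handle
        handle = self.host.allocate(num_bytes)
        if handle is None:
            raise MemoryError("neither GPU nor host memory is available")
        return handle

    def deallocate(self, handle):
        # The set tells us which underlying allocator owns the handle.
        if handle in self.gpu_handles:
            self.gpu_handles.remove(handle)
            self.gpu.deallocate(handle)
        else:
            self.host.deallocate(handle)


gpu = LimitedAllocator("gpu", limit_bytes=100)
host = LimitedAllocator("host", limit_bytes=1000)
alloc = UniversalAllocator(gpu, host)

a = alloc.allocate(80)   # fits under the GPU limit
b = alloc.allocate(50)   # would exceed the GPU limit, so it lands on the host
alloc.deallocate(a)
gpu.set_limit(40)        # shrink the GPU limit at run time
c = alloc.allocate(50)   # now too large for the GPU, goes to the host
```

In this sketch, lowering the limit does not move existing allocations; it only affects where later requests are placed.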
To enable dynamic scaling of computation units, a GpuOpManager with an operator processing queue, running in a standalone thread, is introduced in TensorFlow. The operator executor of TensorFlow is modified to insert GPU operators into the GpuOpManager queue in order, delegating their execution to it. GpuOpManager may delay the actual execution of GPU operators so that the job uses only a limited percentage of the computation capacity.
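A rough Python sketch of this queue-plus-thread design follows. The class name `OpManager`, the `share` parameter, and the throttling rule (idle after each operator long enough that the busy fraction is roughly `share`) are assumptions made for illustration; the text above does not specify how the delay is computed in this repository.

```python
import queue
import threading
import time


class OpManager:
    """Runs submitted operators in order on one dedicated thread (toy model)."""

    def __init__(self, share=1.0):
        if not 0.0 < share <= 1.0:
            raise ValueError("share must be in (0, 1]")
        self.share = share              # assumed: fraction of time spent busy
        self._queue = queue.Queue()     # FIFO, so submission order is kept
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, op):
        self._queue.put(op)

    def shutdown(self):
        self._queue.put(None)           # sentinel: stop after draining the queue
        self._thread.join()

    def _run(self):
        while True:
            op = self._queue.get()
            if op is None:
                break
            start = time.perf_counter()
            op()
            busy = time.perf_counter() - start
            # Assumed throttling rule: delay the next operator so that
            # busy / (busy + idle) is approximately equal to share.
            idle = busy * (1.0 / self.share - 1.0)
            if idle > 0:
                time.sleep(idle)


results = []
manager = OpManager(share=0.5)
for i in range(5):
    manager.submit(lambda i=i: results.append(i))
manager.shutdown()
```

Because a single thread drains a FIFO queue, the operators run in the order they were submitted regardless of how much delay is inserted between them.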
Statistics on memory usage patterns and execution information are aggregated for the local coordinator. The DL frameworks and the local coordinator communicate through the file system. Each side runs a monitor thread that checks the files for incoming job statistics or control signals. To minimize the overhead of memory management, dynamic scaling of memory is triggered at mini-batch boundaries (the end of each session.run() call).
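The file-based signalling described above might look like the following Python sketch. The JSON file format, the `memory_limit` key, the file name `control.json`, and the names `ControlFileMonitor` and `write_signal` are hypothetical; the text does not describe the actual file layout. The sketch only shows a monitor that picks up a control signal from a file and defers applying it until the mini-batch boundary.

```python
import json
import os
import tempfile
import threading


def write_signal(path, signal):
    """Coordinator side: write the signal atomically to avoid partial reads."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(signal, f)
    os.replace(tmp, path)


class ControlFileMonitor:
    """Framework side: polls a control file and applies it between batches."""

    def __init__(self, path, interval_s=0.05):
        self.path = path
        self.interval_s = interval_s
        self.memory_limit = None        # limit currently in effect
        self._pending = None            # limit received but not yet applied
        self._lock = threading.Lock()
        self._stop = threading.Event()

    def poll(self):
        try:
            with open(self.path) as f:
                signal = json.load(f)
        except (FileNotFoundError, json.JSONDecodeError):
            return                      # no signal yet
        with self._lock:
            self._pending = signal.get("memory_limit")

    def start(self):
        def loop():
            # Event.wait returns False on timeout, True once stop() is called.
            while not self._stop.wait(self.interval_s):
                self.poll()
        threading.Thread(target=loop, daemon=True).start()

    def stop(self):
        self._stop.set()

    def end_of_mini_batch(self):
        # Apply a pending limit only here, never in the middle of a batch.
        with self._lock:
            if self._pending is not None:
                self.memory_limit = self._pending
                self._pending = None
        return self.memory_limit


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "control.json")
    monitor = ControlFileMonitor(path)
    write_signal(path, {"memory_limit": 4096})
    monitor.poll()                      # called directly to keep the demo deterministic
    limit_mid_batch = monitor.memory_limit
    limit_after_batch = monitor.end_of_mini_batch()
```

In a long-running job, `start()` would run `poll()` periodically on a background thread; the demo calls `poll()` once by hand so the outcome does not depend on timing.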
TensorFlow is an end-to-end open source platform
for machine learning. It has a comprehensive, flexible ecosystem of
tools,
libraries, and
community resources that lets
researchers push the state of the art in ML and developers easily build and
deploy ML-powered applications.
TensorFlow was originally developed by researchers and engineers working on the
Google Brain team within Google's Machine Intelligence Research organization for
the purposes of conducting machine learning and deep neural networks research.
The system is general enough to be applicable in a wide variety of other
domains, as well.
TensorFlow provides stable Python
and C++ APIs, as well as
non-guaranteed backwards compatible API for
other languages.
Keep up-to-date with release announcements and security updates by subscribing
to
announce@tensorflow.org.
See all the mailing lists.
Install
See the TensorFlow install guide for the
pip package, to
enable GPU support, use a
Docker container, and
build from source.
To install the current release:
$ pip install tensorflow
The tensorflow
package also includes GPU support on Linux and Windows.
If package size is a concern, CPU-only packages can be installed with:
$ pip install tensorflow-cpu
Nightly binaries are available for testing using the
tf-nightly and
tf-nightly-gpu packages on PyPI.
Try your first TensorFlow program
$ python
>>> import tensorflow as tf
>>> tf.enable_eager_execution()
>>> tf.add(1, 2).numpy()
3
>>> hello = tf.constant('Hello, TensorFlow!')
>>> hello.numpy()
b'Hello, TensorFlow!'
For more examples, see the
TensorFlow tutorials.
Contribution guidelines
If you want to contribute to TensorFlow, be sure to review the contribution
guidelines. This project adheres to TensorFlow's
code of conduct. By participating, you are expected to
uphold this code.
We use GitHub issues for
tracking requests and bugs. Please see
TensorFlow Discuss
for general questions and discussion, and direct specific questions to
Stack Overflow.
The TensorFlow project strives to abide by generally accepted best practices in open-source software development.

Continuous build status
Official Builds
Build Type | Artifacts
--- | ---
Linux CPU | pypi
Linux GPU | pypi
Linux XLA | TBA
MacOS | pypi
Windows CPU | pypi
Windows GPU | pypi
Android |
Raspberry Pi 0 and 1 | Py2 Py3
Raspberry Pi 2 and 3 | Py2 Py3

Build Type | Artifacts
--- | ---
Linux AMD ROCm GPU Nightly | Nightly
Linux AMD ROCm GPU Stable Release | Release
Linux s390x Nightly | Nightly
Linux s390x CPU Stable Release | Release
Linux ppc64le CPU Nightly | Nightly
Linux ppc64le CPU Stable Release | Release
Linux ppc64le GPU Nightly | Nightly
Linux ppc64le GPU Stable Release | Release
Linux CPU with Intel® MKL-DNN Nightly | Nightly
Linux CPU with Intel® MKL-DNN Supports Python 2.7, 3.4, 3.5, and 3.6 | 1.13.1 pypi
Red Hat® Enterprise Linux® 7.6 CPU & GPU Python 2.7, 3.6 | 1.13.1 pypi
Resources
Learn more about the
TensorFlow community and how to
contribute.
License
Apache License 2.0