Deep Learning with Microsoft Cognitive Toolkit
Barbara Fusinska
@BasiaFusinska
About me
Programmer
Machine Learning
Data Solutions Architect
@BasiaFusinska
Agenda
• Deep Learning
• Why CNTK?
• How to start?
• Components of the platform
• The future
Deep Learning?
Neural Networks
• Interconnected units (neurons)
• Activation signal(s)
• Information processing
• Learning involves adjustments to the synaptic connections
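The bullets above can be made concrete with a single artificial neuron. This is a minimal plain-Python sketch, not CNTK code: the sigmoid activation, the squared-error objective, the learning rate and the input values are all assumptions chosen for illustration.

```python
import math

def sigmoid(x):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron_output(inputs, weights, bias):
    """Weighted sum of the incoming signals, passed through the activation."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(total)

def update_weights(inputs, weights, bias, target, learning_rate=0.5):
    """One gradient-descent step on squared error for a single sigmoid neuron."""
    out = neuron_output(inputs, weights, bias)
    # d(0.5 * (out - target)^2) / d(total) = (out - target) * out * (1 - out)
    delta = (out - target) * out * (1.0 - out)
    new_weights = [w - learning_rate * delta * x for w, x in zip(weights, inputs)]
    new_bias = bias - learning_rate * delta
    return new_weights, new_bias

# Hypothetical example: teach the neuron to fire (output near 1) for this input.
inputs = [1.0, 0.0]
weights, bias = [0.0, 0.0], 0.0
before = neuron_output(inputs, weights, bias)
for _ in range(100):
    weights, bias = update_weights(inputs, weights, bias, target=1.0)
after = neuron_output(inputs, weights, bias)
```

Only the weight attached to the active input changes, which is the "adjustment to the connections" the slide refers to.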
Artificial Neural Network
Brief History of Neural Networks
Big data
• Volumes
• Cheap storage
• Cloud
• Communities
• Crowdsourcing
GPU
• Matrix operations
• Code optimisation
• Dedicated computing
Computer Vision
• Tag images
• Categorizing images
• Detect human faces
• Facial expressions
• Flag adult content
• Crop photos
• Optical character recognition
Natural Language Processing
• Sentiment analysis
• Topic extraction
• Named entity recognition
• Translation
• Automatic text answering
• Automatic text summarisation
Deep Learning Toolkit
• Easy to use
• Fast
• Flexible
• Production tested
• Multi-GPU/multi-server
• Open source
Highly performant on large-scale data
Time [s]        Caffe   CNTK   MXNET   TensorFlow   Torch
fcn5 (512)        172    228     183          190     158
alexnet (128)     328    237     245          853     336
resnet (128)     6554   2496    4165         8831    5080
http://dlbench.comp.hkbu.edu.hk/
https://arxiv.org/pdf/1608.07249.pdf
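To read the benchmark numbers at a glance, the short sketch below picks the fastest toolkit per network and expresses every time relative to CNTK. It assumes that lower values are better, as the [s] unit suggests; the figures are copied from the slide and nothing else is added.

```python
# Times copied from the benchmark table on the slide (unit [s] as given there).
times = {
    "fcn5 (512)": {"Caffe": 172, "CNTK": 228, "MXNET": 183, "TensorFlow": 190, "Torch": 158},
    "alexnet (128)": {"Caffe": 328, "CNTK": 237, "MXNET": 245, "TensorFlow": 853, "Torch": 336},
    "resnet (128)": {"Caffe": 6554, "CNTK": 2496, "MXNET": 4165, "TensorFlow": 8831, "Torch": 5080},
}

# Fastest toolkit for each network (smallest time wins).
fastest = {net: min(row, key=row.get) for net, row in times.items()}

# Each toolkit's time as a multiple of CNTK's time on the same network.
relative_to_cntk = {
    net: {tool: round(t / row["CNTK"], 2) for tool, t in row.items()}
    for net, row in times.items()
}

for net in times:
    print(net, "fastest:", fastest[net], relative_to_cntk[net])
```

On these figures CNTK is the fastest on the two convolutional networks, while Torch leads on the fully connected fcn5.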
Building blocks
• Simple components
• Complex layers
• Training
• Testing
• End-to-end solutions
Production tested
• Cognitive Services
• Skype Translator
• Cortana
• Bing, Bing Ads
• Augmented Reality (object recognition)
Age detection
• Smile
• Beard
• Glasses
Uber: Face recognition
Smart refrigerator (Liebherr)
How to contribute?
• Microsoft Contributor License Agreement (CLA)
• Raising an issue (https://github.com/Microsoft/CNTK/issues)
• Forking the repository and making a Pull Request (https://github.com/Microsoft/CNTK)
• https://docs.microsoft.com/en-us/cognitive-toolkit/contributing-to-cntk
Setup/installation
• Installation on Windows/Linux machines (https://docs.microsoft.com/en-us/cognitive-toolkit/setup-cntk-on-your-machine)
• Using Docker images (https://docs.microsoft.com/en-us/cognitive-toolkit/CNTK-Docker-Containers)
• Binary setup (https://docs.microsoft.com/en-us/cognitive-toolkit/Setup-Linux-Python)
mnistTrain = [
    action = "train"
    # network builder block (from scratch or load)
    BrainScriptNetworkBuilder = (
        new ComputationNetwork
        include "$ConfigDir$/OneHidden.bs"
    )
    # learner block (which training algorithm to use)
    SGD = [
        modelPath = "$ModelDir$/Hidden_Model.dnn"
        epochSize = 60000
        minibatchSize = 32
        learningRatesPerMB = 0.1
        maxEpochs = 30
    ]
    # reader block (how to load features and labels)
    reader = [
        readerType = "CNTKTextFormatReader"
        file = "$DataDir$/Train-28x28_cntk_text.txt"
        input = [
            features = [
                dim = 784
                format = "dense"
            ]
            labels = [
                dim = 10
                format = "dense"
            ]
        ]
    ]
]
BrainScript
https://docs.microsoft.com/en-us/cognitive-toolkit/using-cntk-with-brainscript
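The SGD block fixes how much work a training run does. As a rough sketch, the arithmetic below derives the number of minibatch updates from epochSize, minibatchSize and maxEpochs. It assumes every epoch processes exactly epochSize samples at a constant minibatch size, and that the 784-dimensional feature vector is a flattened 28x28 image, as the data file name suggests.

```python
import math

# Values copied from the configuration above.
epoch_size = 60000      # epochSize
minibatch_size = 32     # minibatchSize
max_epochs = 30         # maxEpochs
feature_dim = 784       # reader: features.dim
label_dim = 10          # reader: labels.dim

# Assumption: each input is a 28x28 image flattened into one vector.
image_side = 28
flattened_dim = image_side * image_side

# Number of parameter updates per epoch and for the whole run.
minibatches_per_epoch = math.ceil(epoch_size / minibatch_size)
total_minibatches = minibatches_per_epoch * max_epochs
total_samples = epoch_size * max_epochs

print(minibatches_per_epoch, total_minibatches, total_samples)
```

With these settings the run performs 1875 updates per epoch, 56250 in total.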
APIs
• Python (https://www.cntk.ai/pythondocs/)
  • Abstraction for model definition, compute, learning algorithms, data reading
  • Concise network definition
  • Distributed, highly scalable training
  • Efficient data interfaces
• C++
  • Core computations, network composition & training, data reading
  • Model training and evaluation
  • Driven from native code
• C#
  • Evaluation-related APIs
  • NuGet packages
Getting started
• Python & BrainScript tutorials (https://docs.microsoft.com/en-us/cognitive-toolkit/tutorials): CNN, LSTM, Reinforcement Learning, ATIS, Fast R-CNN
• Python examples (https://docs.microsoft.com/en-us/cognitive-toolkit/examples): ResNet, SequenceClassification, Sequence2Sequence, Language Understanding
• Try it live with Azure Notebooks (https://notebooks.azure.com/library/cntkbeta2)
Setup on Azure
• Training on Data Science Virtual Machine (Windows or Linux)
• Azure GPU – Azure Deep Learning Extension
• Azure Batch – Dockerized CNTK
https://docs.microsoft.com/en-us/cognitive-toolkit/CNTK-on-Azure
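Training
import numpy as np
import cntk as C
from cntk.layers import Dense, Sequential

# inputs, outputs and hidden_dimension are integer sizes defined elsewhere.
# Input variables denoting the features and label data
# (newer CNTK releases name this call C.input_variable)
features = C.input(inputs, np.float32)
label = C.input(outputs, np.float32)

# Instantiate the feedforward classification model
my_model = Sequential([
    Dense(hidden_dimension, activation=C.sigmoid),
    Dense(outputs)])
z = my_model(features)

# Training loss and evaluation metric
loss = C.cross_entropy_with_softmax(z, label)
cerr = C.classification_error(z, label)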
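To show what the model definition above computes, here is a plain-Python sketch of the same shape of network: one sigmoid hidden layer, a linear output layer, and a softmax cross-entropy loss. It is not CNTK code. The helper names, layer sizes, random weights and input values are all hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dense(x, weights, biases, activation=None):
    """Fully connected layer; weights[j] holds the incoming weights of unit j."""
    out = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [activation(v) for v in out] if activation else out

def softmax(logits):
    """Normalise logits into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_cross_entropy(logits, one_hot):
    """Cross entropy between a one-hot label and softmax(logits)."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

# Hypothetical sizes standing in for inputs, hidden_dimension and outputs.
rng = random.Random(0)
inputs, hidden_dimension, outputs = 4, 3, 2
w1 = [[rng.uniform(-1, 1) for _ in range(inputs)] for _ in range(hidden_dimension)]
b1 = [0.0] * hidden_dimension
w2 = [[rng.uniform(-1, 1) for _ in range(hidden_dimension)] for _ in range(outputs)]
b2 = [0.0] * outputs

x = [0.5, -1.0, 0.25, 2.0]                       # one feature vector
hidden = dense(x, w1, b1, activation=sigmoid)    # Dense(hidden_dimension, sigmoid)
z = dense(hidden, w2, b2)                        # Dense(outputs), no activation
loss = softmax_cross_entropy(z, [1.0, 0.0])      # label is class 0
```

The output layer has no activation because the loss applies the softmax itself, which mirrors how the slide passes z straight into the loss.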
Perform training
# Module path as of CNTK 2.x
from cntk.learners import sgd, learning_rate_schedule, UnitType

# Instantiate the trainer object to drive the model training
lr_per_minibatch = learning_rate_schedule(0.125, UnitType.minibatch)
trainer = C.Trainer(z, (loss, cerr), [sgd(z.parameters, lr=lr_per_minibatch)])

# Run training; minibatch_data(n) is a helper defined elsewhere
# that returns a (features, labels) pair of n samples
aggregate_loss = 0.0
for i in range(num_minibatches_to_train):
    train_features, labels = minibatch_data(minibatch_size)
    # Specify the mapping of input variables in the model
    trainer.train_minibatch({features: train_features, label: labels})
    sample_count = trainer.previous_minibatch_sample_count
    aggregate_loss += trainer.previous_minibatch_loss_average * sample_count
    # Sample-weighted average of the training loss so far
    last_avg_error = aggregate_loss / trainer.total_number_of_samples_seen
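The loop multiplies each minibatch's average loss by its sample count before dividing by the total number of samples. The sketch below, in plain Python with made-up numbers, shows why: when minibatches differ in size, a sample-weighted average and a naive mean of per-minibatch averages disagree. The function name and the values are hypothetical.

```python
def weighted_average(minibatch_results):
    """Average loss per sample, given (average_loss, sample_count) pairs."""
    aggregate = 0.0
    seen = 0
    for average_loss, sample_count in minibatch_results:
        aggregate += average_loss * sample_count
        seen += sample_count
    if seen == 0:
        raise ValueError("no samples seen")
    return aggregate / seen

# Hypothetical per-minibatch results; the last minibatch is smaller.
results = [(0.9, 32), (0.7, 32), (0.2, 16)]

overall = weighted_average(results)                   # weights each sample equally
naive = sum(a for a, _ in results) / len(results)     # weights each minibatch equally
print(overall, naive)
```

Here the weighted figure is about 0.68 while the naive mean is about 0.6, because the small final minibatch is over-represented in the naive version.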
Evaluate
# Loading test data in minibatches; test_minibatch_data(n) is assumed to
# return a (features, labels) pair like minibatch_data
test_result = 0.0
for i in range(num_minibatches_to_test):
    test_features, test_label = test_minibatch_data(test_minibatch_size)
    eval_error = trainer.test_minibatch({features: test_features, label: test_label})
    test_result += eval_error
# Average error over the test minibatches
avg_test_error = test_result / num_minibatches_to_test

# Check the predictions on the last test minibatch
out = C.softmax(z)
predicted_label_prob = out.eval({features: test_features})
pred = np.argmax(predicted_label_prob, axis=1)
gtlabel = np.argmax(test_label, axis=1)
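The prediction check boils down to taking the index of the largest probability for each sample and comparing it with the index of the 1 in the one-hot label. A minimal plain-Python sketch with made-up probabilities and labels (all values hypothetical):

```python
def argmax(values):
    """Index of the largest element (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)

# Hypothetical softmax outputs for four samples and their one-hot labels.
predicted_label_prob = [[0.8, 0.2], [0.3, 0.7], [0.6, 0.4], [0.1, 0.9]]
test_label = [[1, 0], [0, 1], [0, 1], [0, 1]]

pred = [argmax(p) for p in predicted_label_prob]   # predicted class per sample
gtlabel = [argmax(t) for t in test_label]          # ground-truth class per sample

# Classification error: fraction of samples where the two disagree.
errors = sum(p != g for p, g in zip(pred, gtlabel))
error_rate = errors / len(gtlabel)
print(pred, gtlabel, error_rate)
```

The third sample is the only misclassification, so the error rate on this toy data is 0.25.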
Algorithms
• Feed Forward
• Convolutional Neural Network
• Recurrent Neural Network
• Long Short-Term Memory
• Sequence-to-Sequence
Convolutional Neural Network
Demo
Layers library
• Dense
• Convolution
• (Global)MaxPooling
• (Global)AveragePooling
• Dropout
• Embedding
• LSTM
• RNNUnit
Multiple GPUs & machines
• Message Passing Interface
• License
  https://github.com/Microsoft/CNTK/blob/master/LICENSE.md
  https://docs.microsoft.com/en-us/cognitive-toolkit/cntk-1bit-sgd-license
• Parallel training:
  • The same machine
  • Across multiple computing nodes
• Algorithms
  • DataParallelSGD
  • BlockMomentumSGD
  • ModelAveragingSGD
  • DataParallelASGD
https://docs.microsoft.com/en-us/cognitive-toolkit/Multiple-GPUs-and-machines
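The idea behind data-parallel SGD can be sketched in a few lines of plain Python. This is a conceptual illustration only, not CNTK's implementation (which distributes the work across processes): each "worker" computes the gradient on its own shard of the minibatch, the partial gradients are summed, and one shared update is applied. The one-parameter model, data and function names are hypothetical.

```python
def gradient_sum(w, shard):
    """Sum of d/dw 0.5 * (w * x - y)^2 over the samples in one worker's shard."""
    return sum((w * x - y) * x for x, y in shard)

def data_parallel_step(w, minibatch, num_workers, learning_rate):
    """One SGD step with the minibatch split across num_workers workers."""
    shards = [minibatch[i::num_workers] for i in range(num_workers)]
    # In a real system these partial gradients are computed concurrently
    # and then aggregated across workers.
    partial = [gradient_sum(w, shard) for shard in shards]
    grad = sum(partial) / len(minibatch)
    return w - learning_rate * grad

# Hypothetical data generated by y = 2 * x.
minibatch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]

w_single = data_parallel_step(0.0, minibatch, num_workers=1, learning_rate=0.1)
w_parallel = data_parallel_step(0.0, minibatch, num_workers=2, learning_rate=0.1)

w = 0.0
for _ in range(50):
    w = data_parallel_step(w, minibatch, num_workers=2, learning_rate=0.1)
```

Splitting the minibatch does not change the update: one worker and two workers arrive at the same weight, which is why this scheme scales training without altering the result of each step.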
The future of Cognitive Toolkit
• Dynamic Networks
• Support for Model Compression/Sparsity
• Partial Precision Training and Evaluation
• Offline compilation techniques
• More language bindings
Keep in touch
BarbaraFusinska.com
@BasiaFusinska
