big data & mixed-integer (nonlinear) programming · 2016. 2. 1. · big data &...

48
Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work with Marie-Claude C ˆ ot´ e (JDA) Canada Excellence Research Chair ´ Ecole Polytechnique de Montr´ eal, Qu ´ ebec, Canada [email protected] Data Science for Whole Energy Systems @ Alan Turing Institute Edinburgh, January 29, 2016 Andrea Lodi (CERC, Polytechnique Montr´ eal) Big Data & Integer Programming ATI 2016 1 / 24

Upload: others

Post on 24-Feb-2021

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

Big Data & Mixed-Integer (Nonlinear) Programming

Andrea Lodijoint work with Marie-Claude Cote (JDA)

Canada Excellence Research ChairEcole Polytechnique de Montreal, Quebec, Canada

[email protected]

Data Science for Whole Energy Systems @ Alan Turing InstituteEdinburgh, January 29, 2016

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 1 / 24

Page 2: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

Outline

1 Two of my favorite examples of Big Data

2 Something I do find interesting in Big Data:1 New (business) models2 Formulating and solving integrated models

3 The role of learning:1 An example in Retail2 Machine Learning paradigm

4 Machine Learning and Mathematical Optimization:1 The role of discrete decisions2 More sophisticated nonlinear models / algorithms

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 2 / 24

Page 3: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

Ex. 1: automatic data collection (aka nowhere to hide)

A face recognition system has been put inplace in a mall somewhere in the US.

Main purpose of the system was security.

10/21/15, 11:25 PMNowhere To Hide: FBI To Roll-Out $1 Billion Facial Recognition System

Page 1 of 3http://newrisingmedia.com/all/2012/9/10/nowhere-to-hide-fbi-to-roll-out-1-billion-facial-recognition.html

Nowhere To Hide: FBI To Roll-Out $1 BillionFacial Recognition System

September 10, 2012

By: Jason England

Tags: CCTV, DNA,

FBI, Facial

Recognition, Minority

Report, NGI, Next

Generation

Identification, Pre-

Crime, Camera,

Surveillance

Category: Tech

Comment

The FBI has revealed its plans to launch a $1 billion Minority Report-like facial recognition system across the United States sounprecedented in scope that it will be able to be used to identify criminals with greater than 90 percent accuracy.

Promising to be an upheaval of the national fingerprint database, it’s all part of the billion dollar Next Generation Identification (NGI) programmethat is set to add mugshots of citizens to the database, as well as biometric data such as DNA information, iris scans and voice recognition data toexpand the tool-set available to authorities.

According to New Scientist, a ‘handful’ of states have already started to upload their photos,presumably of known criminals, to aid the pilot programme. The facial recognition system works inone of two ways. Law enforcement agencies can either take an image of a person of interest andcompare that against the national repository of images held by the FBI to produce a list of potential'hits', fairly simple. Or, and this is the most fascinating application, footage from security cameras canpotentially be fed, in real-time, to the software whereby a mark can be picked out in a crowd andfollowed through the streets. Remind you of anything? Perhaps you remember the scene in MinorityReport where surveillance cameras alert the authorities of Tom Cruise's whereabouts after catching a

glimpse of his face?

Don't expect for a moment the roll-out of such a powerful system will happen without a hitch, however. The FBI surely has a duty to US citizens toexplain how such power will be implemented without being abused, nor one that will be seen to encroach on our most basic human rights. So far,the bureau has been pilot-testing the system since February of this year, and pins down 2014 as a possible time-frame to roll out the programmenationwide.

Richard Birkett

Home Technology Social Media Science

Movies Music Games

New Rising

Media

New Rising Media is the definitive source of technology, social media, science,

movies, music and games news in the UK.

More

After collecting data for some time, it hasbeen observed that the large majority of theclients entering in the mall around lunch time(11 AM - 3 PM) was composed by Asian-American people.

10/21/15, 11:27 PMNowhere To Hide: FBI To Roll-Out $1 Billion Facial Recognition System

Page 1 of 3http://newrisingmedia.com/all/2012/9/10/nowhere-to-hide-fbi-to-roll-out-1-billion-facial-recognition.html

Nowhere To Hide: FBI To Roll-Out $1 BillionFacial Recognition System

September 10, 2012

By: Jason England

Tags: CCTV, DNA,

FBI, Facial

Recognition, Minority

Report, NGI, Next

Generation

Identification, Pre-

Crime, Camera,

Surveillance

Category: Tech

Comment

The FBI has revealed its plans to launch a $1 billion Minority Report-like facial recognition system across the United States sounprecedented in scope that it will be able to be used to identify criminals with greater than 90 percent accuracy.

Promising to be an upheaval of the national fingerprint database, it’s all part of the billion dollar Next Generation Identification (NGI) programmethat is set to add mugshots of citizens to the database, as well as biometric data such as DNA information, iris scans and voice recognition data toexpand the tool-set available to authorities.

According to New Scientist, a ‘handful’ of states have already started to upload their photos,presumably of known criminals, to aid the pilot programme. The facial recognition system works inone of two ways. Law enforcement agencies can either take an image of a person of interest andcompare that against the national repository of images held by the FBI to produce a list of potential'hits', fairly simple. Or, and this is the most fascinating application, footage from security cameras canpotentially be fed, in real-time, to the software whereby a mark can be picked out in a crowd andfollowed through the streets. Remind you of anything? Perhaps you remember the scene in MinorityReport where surveillance cameras alert the authorities of Tom Cruise's whereabouts after catching a

glimpse of his face?

Don't expect for a moment the roll-out of such a powerful system will happen without a hitch, however. The FBI surely has a duty to US citizens toexplain how such power will be implemented without being abused, nor one that will be seen to encroach on our most basic human rights. So far,the bureau has been pilot-testing the system since February of this year, and pins down 2014 as a possible time-frame to roll out the programmenationwide.

Richard Birkett

Home Technology Social Media Science

Movies Music Games

New Rising

Media

New Rising Media is the definitive source of technology, social media, science,

movies, music and games news in the UK.

More

The company owning the mall implemented two simple actions:revised the shifts of the employees so as that (most of) theAsian-American ones were on duty in that time window;hired new Asian-American employees.

The overall effect has been a huge increase in sales.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 3 / 24

Page 4: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

Ex. 1: automatic data collection (aka nowhere to hide)

A face recognition system has been put inplace in a mall somewhere in the US.

Main purpose of the system was security.

10/21/15, 11:25 PMNowhere To Hide: FBI To Roll-Out $1 Billion Facial Recognition System

Page 1 of 3http://newrisingmedia.com/all/2012/9/10/nowhere-to-hide-fbi-to-roll-out-1-billion-facial-recognition.html

Nowhere To Hide: FBI To Roll-Out $1 BillionFacial Recognition System

September 10, 2012

By: Jason England

Tags: CCTV, DNA,

FBI, Facial

Recognition, Minority

Report, NGI, Next

Generation

Identification, Pre-

Crime, Camera,

Surveillance

Category: Tech

Comment

The FBI has revealed its plans to launch a $1 billion Minority Report-like facial recognition system across the United States sounprecedented in scope that it will be able to be used to identify criminals with greater than 90 percent accuracy.

Promising to be an upheaval of the national fingerprint database, it’s all part of the billion dollar Next Generation Identification (NGI) programmethat is set to add mugshots of citizens to the database, as well as biometric data such as DNA information, iris scans and voice recognition data toexpand the tool-set available to authorities.

According to New Scientist, a ‘handful’ of states have already started to upload their photos,presumably of known criminals, to aid the pilot programme. The facial recognition system works inone of two ways. Law enforcement agencies can either take an image of a person of interest andcompare that against the national repository of images held by the FBI to produce a list of potential'hits', fairly simple. Or, and this is the most fascinating application, footage from security cameras canpotentially be fed, in real-time, to the software whereby a mark can be picked out in a crowd andfollowed through the streets. Remind you of anything? Perhaps you remember the scene in MinorityReport where surveillance cameras alert the authorities of Tom Cruise's whereabouts after catching a

glimpse of his face?

Don't expect for a moment the roll-out of such a powerful system will happen without a hitch, however. The FBI surely has a duty to US citizens toexplain how such power will be implemented without being abused, nor one that will be seen to encroach on our most basic human rights. So far,the bureau has been pilot-testing the system since February of this year, and pins down 2014 as a possible time-frame to roll out the programmenationwide.

Richard Birkett

Home Technology Social Media Science

Movies Music Games

New Rising

Media

New Rising Media is the definitive source of technology, social media, science,

movies, music and games news in the UK.

More

After collecting data for some time, it hasbeen observed that the large majority of theclients entering in the mall around lunch time(11 AM - 3 PM) was composed by Asian-American people.

10/21/15, 11:27 PMNowhere To Hide: FBI To Roll-Out $1 Billion Facial Recognition System

Page 1 of 3http://newrisingmedia.com/all/2012/9/10/nowhere-to-hide-fbi-to-roll-out-1-billion-facial-recognition.html

Nowhere To Hide: FBI To Roll-Out $1 BillionFacial Recognition System

September 10, 2012

By: Jason England

Tags: CCTV, DNA,

FBI, Facial

Recognition, Minority

Report, NGI, Next

Generation

Identification, Pre-

Crime, Camera,

Surveillance

Category: Tech

Comment

The FBI has revealed its plans to launch a $1 billion Minority Report-like facial recognition system across the United States sounprecedented in scope that it will be able to be used to identify criminals with greater than 90 percent accuracy.

Promising to be an upheaval of the national fingerprint database, it’s all part of the billion dollar Next Generation Identification (NGI) programmethat is set to add mugshots of citizens to the database, as well as biometric data such as DNA information, iris scans and voice recognition data toexpand the tool-set available to authorities.

According to New Scientist, a ‘handful’ of states have already started to upload their photos,presumably of known criminals, to aid the pilot programme. The facial recognition system works inone of two ways. Law enforcement agencies can either take an image of a person of interest andcompare that against the national repository of images held by the FBI to produce a list of potential'hits', fairly simple. Or, and this is the most fascinating application, footage from security cameras canpotentially be fed, in real-time, to the software whereby a mark can be picked out in a crowd andfollowed through the streets. Remind you of anything? Perhaps you remember the scene in MinorityReport where surveillance cameras alert the authorities of Tom Cruise's whereabouts after catching a

glimpse of his face?

Don't expect for a moment the roll-out of such a powerful system will happen without a hitch, however. The FBI surely has a duty to US citizens toexplain how such power will be implemented without being abused, nor one that will be seen to encroach on our most basic human rights. So far,the bureau has been pilot-testing the system since February of this year, and pins down 2014 as a possible time-frame to roll out the programmenationwide.

Richard Birkett

Home Technology Social Media Science

Movies Music Games

New Rising

Media

New Rising Media is the definitive source of technology, social media, science,

movies, music and games news in the UK.

More

The company owning the mall implemented two simple actions:revised the shifts of the employees so as that (most of) theAsian-American ones were on duty in that time window;hired new Asian-American employees.

The overall effect has been a huge increase in sales.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 3 / 24

Page 5: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

Ex. 1: automatic data collection (aka nowhere to hide)

A face recognition system has been put inplace in a mall somewhere in the US.

Main purpose of the system was security.

10/21/15, 11:25 PMNowhere To Hide: FBI To Roll-Out $1 Billion Facial Recognition System

Page 1 of 3http://newrisingmedia.com/all/2012/9/10/nowhere-to-hide-fbi-to-roll-out-1-billion-facial-recognition.html

Nowhere To Hide: FBI To Roll-Out $1 BillionFacial Recognition System

September 10, 2012

By: Jason England

Tags: CCTV, DNA,

FBI, Facial

Recognition, Minority

Report, NGI, Next

Generation

Identification, Pre-

Crime, Camera,

Surveillance

Category: Tech

Comment

The FBI has revealed its plans to launch a $1 billion Minority Report-like facial recognition system across the United States sounprecedented in scope that it will be able to be used to identify criminals with greater than 90 percent accuracy.

Promising to be an upheaval of the national fingerprint database, it’s all part of the billion dollar Next Generation Identification (NGI) programmethat is set to add mugshots of citizens to the database, as well as biometric data such as DNA information, iris scans and voice recognition data toexpand the tool-set available to authorities.

According to New Scientist, a ‘handful’ of states have already started to upload their photos,presumably of known criminals, to aid the pilot programme. The facial recognition system works inone of two ways. Law enforcement agencies can either take an image of a person of interest andcompare that against the national repository of images held by the FBI to produce a list of potential'hits', fairly simple. Or, and this is the most fascinating application, footage from security cameras canpotentially be fed, in real-time, to the software whereby a mark can be picked out in a crowd andfollowed through the streets. Remind you of anything? Perhaps you remember the scene in MinorityReport where surveillance cameras alert the authorities of Tom Cruise's whereabouts after catching a

glimpse of his face?

Don't expect for a moment the roll-out of such a powerful system will happen without a hitch, however. The FBI surely has a duty to US citizens toexplain how such power will be implemented without being abused, nor one that will be seen to encroach on our most basic human rights. So far,the bureau has been pilot-testing the system since February of this year, and pins down 2014 as a possible time-frame to roll out the programmenationwide.

Richard Birkett

Home Technology Social Media Science

Movies Music Games

New Rising

Media

New Rising Media is the definitive source of technology, social media, science,

movies, music and games news in the UK.

More

After collecting data for some time, it hasbeen observed that the large majority of theclients entering in the mall around lunch time(11 AM - 3 PM) was composed by Asian-American people.

10/21/15, 11:27 PMNowhere To Hide: FBI To Roll-Out $1 Billion Facial Recognition System

Page 1 of 3http://newrisingmedia.com/all/2012/9/10/nowhere-to-hide-fbi-to-roll-out-1-billion-facial-recognition.html

Nowhere To Hide: FBI To Roll-Out $1 BillionFacial Recognition System

September 10, 2012

By: Jason England

Tags: CCTV, DNA,

FBI, Facial

Recognition, Minority

Report, NGI, Next

Generation

Identification, Pre-

Crime, Camera,

Surveillance

Category: Tech

Comment

The FBI has revealed its plans to launch a $1 billion Minority Report-like facial recognition system across the United States sounprecedented in scope that it will be able to be used to identify criminals with greater than 90 percent accuracy.

Promising to be an upheaval of the national fingerprint database, it’s all part of the billion dollar Next Generation Identification (NGI) programmethat is set to add mugshots of citizens to the database, as well as biometric data such as DNA information, iris scans and voice recognition data toexpand the tool-set available to authorities.

According to New Scientist, a ‘handful’ of states have already started to upload their photos,presumably of known criminals, to aid the pilot programme. The facial recognition system works inone of two ways. Law enforcement agencies can either take an image of a person of interest andcompare that against the national repository of images held by the FBI to produce a list of potential'hits', fairly simple. Or, and this is the most fascinating application, footage from security cameras canpotentially be fed, in real-time, to the software whereby a mark can be picked out in a crowd andfollowed through the streets. Remind you of anything? Perhaps you remember the scene in MinorityReport where surveillance cameras alert the authorities of Tom Cruise's whereabouts after catching a

glimpse of his face?

Don't expect for a moment the roll-out of such a powerful system will happen without a hitch, however. The FBI surely has a duty to US citizens toexplain how such power will be implemented without being abused, nor one that will be seen to encroach on our most basic human rights. So far,the bureau has been pilot-testing the system since February of this year, and pins down 2014 as a possible time-frame to roll out the programmenationwide.

Richard Birkett

Home Technology Social Media Science

Movies Music Games

New Rising

Media

New Rising Media is the definitive source of technology, social media, science,

movies, music and games news in the UK.

More

The company owning the mall implemented two simple actions:revised the shifts of the employees so as that (most of) theAsian-American ones were on duty in that time window;hired new Asian-American employees.

The overall effect has been a huge increase in sales.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 3 / 24

Page 6: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

Ex. 1: automatic data collection (aka nowhere to hide)

A face recognition system has been put inplace in a mall somewhere in the US.

Main purpose of the system was security.

10/21/15, 11:25 PMNowhere To Hide: FBI To Roll-Out $1 Billion Facial Recognition System

Page 1 of 3http://newrisingmedia.com/all/2012/9/10/nowhere-to-hide-fbi-to-roll-out-1-billion-facial-recognition.html

Nowhere To Hide: FBI To Roll-Out $1 BillionFacial Recognition System

September 10, 2012

By: Jason England

Tags: CCTV, DNA,

FBI, Facial

Recognition, Minority

Report, NGI, Next

Generation

Identification, Pre-

Crime, Camera,

Surveillance

Category: Tech

Comment

The FBI has revealed its plans to launch a $1 billion Minority Report-like facial recognition system across the United States sounprecedented in scope that it will be able to be used to identify criminals with greater than 90 percent accuracy.

Promising to be an upheaval of the national fingerprint database, it’s all part of the billion dollar Next Generation Identification (NGI) programmethat is set to add mugshots of citizens to the database, as well as biometric data such as DNA information, iris scans and voice recognition data toexpand the tool-set available to authorities.

According to New Scientist, a ‘handful’ of states have already started to upload their photos,presumably of known criminals, to aid the pilot programme. The facial recognition system works inone of two ways. Law enforcement agencies can either take an image of a person of interest andcompare that against the national repository of images held by the FBI to produce a list of potential'hits', fairly simple. Or, and this is the most fascinating application, footage from security cameras canpotentially be fed, in real-time, to the software whereby a mark can be picked out in a crowd andfollowed through the streets. Remind you of anything? Perhaps you remember the scene in MinorityReport where surveillance cameras alert the authorities of Tom Cruise's whereabouts after catching a

glimpse of his face?

Don't expect for a moment the roll-out of such a powerful system will happen without a hitch, however. The FBI surely has a duty to US citizens toexplain how such power will be implemented without being abused, nor one that will be seen to encroach on our most basic human rights. So far,the bureau has been pilot-testing the system since February of this year, and pins down 2014 as a possible time-frame to roll out the programmenationwide.

Richard Birkett

Home Technology Social Media Science

Movies Music Games

New Rising

Media

New Rising Media is the definitive source of technology, social media, science,

movies, music and games news in the UK.

More

After collecting data for some time, it hasbeen observed that the large majority of theclients entering in the mall around lunch time(11 AM - 3 PM) was composed by Asian-American people.

10/21/15, 11:27 PMNowhere To Hide: FBI To Roll-Out $1 Billion Facial Recognition System

Page 1 of 3http://newrisingmedia.com/all/2012/9/10/nowhere-to-hide-fbi-to-roll-out-1-billion-facial-recognition.html

Nowhere To Hide: FBI To Roll-Out $1 BillionFacial Recognition System

September 10, 2012

By: Jason England

Tags: CCTV, DNA,

FBI, Facial

Recognition, Minority

Report, NGI, Next

Generation

Identification, Pre-

Crime, Camera,

Surveillance

Category: Tech

Comment

The FBI has revealed its plans to launch a $1 billion Minority Report-like facial recognition system across the United States sounprecedented in scope that it will be able to be used to identify criminals with greater than 90 percent accuracy.

Promising to be an upheaval of the national fingerprint database, it’s all part of the billion dollar Next Generation Identification (NGI) programmethat is set to add mugshots of citizens to the database, as well as biometric data such as DNA information, iris scans and voice recognition data toexpand the tool-set available to authorities.

According to New Scientist, a ‘handful’ of states have already started to upload their photos,presumably of known criminals, to aid the pilot programme. The facial recognition system works inone of two ways. Law enforcement agencies can either take an image of a person of interest andcompare that against the national repository of images held by the FBI to produce a list of potential'hits', fairly simple. Or, and this is the most fascinating application, footage from security cameras canpotentially be fed, in real-time, to the software whereby a mark can be picked out in a crowd andfollowed through the streets. Remind you of anything? Perhaps you remember the scene in MinorityReport where surveillance cameras alert the authorities of Tom Cruise's whereabouts after catching a

glimpse of his face?

Don't expect for a moment the roll-out of such a powerful system will happen without a hitch, however. The FBI surely has a duty to US citizens toexplain how such power will be implemented without being abused, nor one that will be seen to encroach on our most basic human rights. So far,the bureau has been pilot-testing the system since February of this year, and pins down 2014 as a possible time-frame to roll out the programmenationwide.

Richard Birkett

Home Technology Social Media Science

Movies Music Games

New Rising

Media

New Rising Media is the definitive source of technology, social media, science,

movies, music and games news in the UK.

More

The company owning the mall implemented two simple actions:revised the shifts of the employees so as that (most of) theAsian-American ones were on duty in that time window;hired new Asian-American employees.

The overall effect has been a huge increase in sales.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 3 / 24

Page 7: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

Ex. 2: integrated decision support

20#

Promo(ons#Execu(on#Integrated#Real7(me#Decision#Support#

FRI#7#5PM# FRI#7#10PM#

0#

100#

200#

300#

400#

500#

SAT#7#2PM#

FORECAST)

INVENTORY)

Based#on#the#forecast,#300#blouses#are#sent#for#the#promo(on#that#starts#at#5PM#Friday.#

MONDAY#

Copyright#©#2014#JDA#SoQware#Group,#Inc.#Confiden(al#

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 4 / 24

Page 8: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

Ex. 2: integrated decision support

21#

0#

100#

200#

300#

400#

0#

100#

200#

300#

400#

Promo(ons#Execu(on#Integrated#Real7(me#Decision#Support#

FRI#7#5PM# FRI#7#10PM#

0#

100#

200#

300#

400#

500#

SAT#7#2PM#

FORECAST)

INVENTORY)

Based#on#the#forecast,#300#blouses#are#sent#for#the#promo(on#that#starts#at#5PM#Friday.#

MONDAY#

At#this#pace#blouses#will#be#sold)out)by#2PM#on#Saturday#

An#intra7day#pace7based#forecas(ng#engine#detects#a#poten7al)stock)out#situa(on#and#generates#an#alert)

REAL#TIME#INVENTORY##STATUS#

DEMAND#PREDICITION#

ADJUSTED#By#10PM,#180#blouses#have#sold#

! Copyright#©#2014#JDA#SoQware#Group,#Inc.#Confiden(al#

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 5 / 24

Page 9: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

Ex. 2: integrated decision support

22#

0#

100#

200#

300#

400#

0#

100#

200#

300#

400#

Promo(ons#Execu(on#Integrated#Real7(me#Decision#Support#

FRI#7#5PM# FRI#7#10PM#

0#

100#

200#

300#

400#

500#

SAT#7#2PM#

FORECAST)

INVENTORY)

Based#on#the#forecast,#300#blouses#are#sent#for#the#promo(on#that#starts#at#5PM#Friday.#

MONDAY#

At#this#pace#blouses#will#be#sold)out)by#2PM#on#Saturday#

An#intra7day#pace7based#forecas(ng#engine#detects#a#poten7al)stock)out#situa(on#and#generates#an#alert)

REAL#TIME#INVENTORY##STATUS#

DEMAND#PREDICITION#

ADJUSTED#By#10PM,#180#blouses#have#sold#

!

Retailer#has#a#rela(vely#nimble#supply#chain.##The#system#generates#an#order#at#the#DC#to#be#put#on#the#regular#0900#shipment.##Shelves#are#full#and#customers#are#happy.##Revenues#are#robust#and#promo(onal#efficiency#is#high#

Copyright#©#2014#JDA#SoQware#Group,#Inc.#Confiden(al#

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 6 / 24

Page 10: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

Ex. 2: integrated decision support

23#

0#

100#

200#

300#

400#

0#

100#

200#

300#

400#

Promo(ons#Execu(on#Integrated#Real7(me#Decision#Support#

FRI#7#5PM# FRI#7#10PM#

0#

100#

200#

300#

400#

500#

SAT#7#2PM#

FORECAST)

INVENTORY)

Based#on#the#forecast,#300#blouses#are#sent#for#the#promo(on#that#starts#at#5PM#Friday.#

MONDAY#

At#this#pace#blouses#will#be#sold)out)by#2PM#on#Saturday#

An#intra7day#pace7based#forecas(ng#engine#detects#a#poten7al)stock)out#situa(on#and#generates#an#alert)

But#wait…the#weather#forecast#for#Saturday#is#terrible,#and#the#likely#surge#in#sales#on#Friday#is#a#reflec(on#of#that#

REAL#TIME#INVENTORY##STATUS#

DEMAND#PREDICITION#

ADJUSTED#

CAUSAL#FACTOR#1#

By#10PM,#180#blouses#have#sold#

!

Retailer#has#a#rela(vely#nimble#supply#chain.##The#system#generates#an#order#at#the#DC#to#be#put#on#the#regular#0900#shipment.##Shelves#are#full#and#customers#are#happy.##Revenues#are#robust#and#promo(onal#efficiency#is#high#

Copyright#©#2014#JDA#SoQware#Group,#Inc.#Confiden(al#

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 7 / 24

Page 11: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

Ex. 2: integrated decision support

24#

0#

100#

200#

300#

400#

0#

100#

200#

300#

400#

Promo(ons#Execu(on#Integrated#Real7(me#Decision#Support#

FRI#7#5PM# FRI#7#10PM#

0#

100#

200#

300#

400#

500#

SAT#7#2PM#

FORECAST)

INVENTORY)

Based#on#the#forecast,#300#blouses#are#sent#for#the#promo(on#that#starts#at#5PM#Friday.#

MONDAY#

At#this#pace#blouses#will#be#sold)out)by#2PM#on#Saturday#

An#intra7day#pace7based#forecas(ng#engine#detects#a#poten7al)stock)out#situa(on#and#generates#an#alert)

But#wait…the#weather#forecast#for#Saturday#is#terrible,#and#the#likely#surge#in#sales#on#Friday#is#a#reflec(on#of#that#

Shoppers#have#been#“twee7ng”#about#this#deal#and#that#has#generated#a#buzz#and#an(cipated#traffic#and#sales#

REAL#TIME#INVENTORY##STATUS#

DEMAND#PREDICITION#

ADJUSTED#

CAUSAL#FACTOR#1#

CAUSAL#FACTOR#2#

By#10PM,#180#blouses#have#sold#

!

Retailer#has#a#rela(vely#nimble#supply#chain.##The#system#generates#an#order#at#the#DC#to#be#put#on#the#regular#0900#shipment.##Shelves#are#full#and#customers#are#happy.##Revenues#are#robust#and#promo(onal#efficiency#is#high#

Copyright#©#2014#JDA#SoQware#Group,#Inc.#Confiden(al#

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 8 / 24

Page 12: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

Ex. 2: integrated decision support

25#

0#

100#

200#

300#

400#

0#

100#

200#

300#

400#

FRI#7#5PM# FRI#7#10PM#

0#

100#

200#

300#

400#

500#

SAT#7#2PM#

FORECAST)

INVENTORY)

Based#on#the#forecast,#300#blouses#are#sent#for#the#promo(on#that#starts#at#5PM#Friday.#

MONDAY#

At#this#pace#blouses#will#be#sold)out)by#2PM#on#Saturday#

But#wait…the#weather#forecast#for#Saturday#is#terrible,#and#the#likely#surge#in#sales#on#Friday#is#a#reflec(on#of#that#

Shoppers#have#been#“twee7ng”#about#this#deal#and#that#has#generated#a#buzz#and#an(cipated#traffic#and#sales#

REAL#TIME#INVENTORY##STATUS#

DEMAND#PREDICITION#

ADJUSTED#

CAUSAL#FACTOR#1#

CAUSAL#FACTOR#2#

By#10PM,#180#blouses#have#sold#

An#intra7day#pace7based#forecas(ng#engine#detects#a#poten7al)stock)out#situa(on#and#generates#an#alert) !

Promo(ons#Execu(on#Integrated#Real7(me#Decision#Support#

Real7(me#demand7sensing#allows##retailers#to#improve#the#execu(on#of#their#promo(ons#and#to#op(mize#future#promo(onal#plans#

Copyright#©#2014#JDA#SoQware#Group,#Inc.#Confiden(al#

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 9 / 24

Page 13: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

What I like of Big Data

The first example shows that automatic collection of data can lead to thedefinition of new (optimization) problems.

Disseminating sensors (including mobile devices) everywhere has becomecheap (and cool!) but the real challenge is taking decisions over the collected(complex) data.

It is not completely clear if the (applied) optimization problems we were usedto solve in contexts as diverse as routing, supply chain and logistics, energy,telecommunications, etc. are still there or, instead, have radically changed.

The spirit of such a change is shown by the second example: the end-users“behavior” is putting more and more pressure on the decision makers and, bytransitivity, on the optimizers. This is not true only in the retail industry butvirtually in any other in which a service is delivered:

routing, I can check with my mobile device where cabs/buses are located;traffic management, I am aware of congestions, accidents, etc. in the city;cache allocation for video streaming, complaints escalate in real time.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 10 / 24

Page 14: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

What I like of Big Data

The first example shows that automatic collection of data can lead to thedefinition of new (optimization) problems.

Disseminating sensors (including mobile devices) everywhere has becomecheap (and cool!) but the real challenge is taking decisions over the collected(complex) data.

It is not completely clear if the (applied) optimization problems we were usedto solve in contexts as diverse as routing, supply chain and logistics, energy,telecommunications, etc. are still there or, instead, have radically changed.

The spirit of such a change is shown by the second example: the end-users“behavior” is putting more and more pressure on the decision makers and, bytransitivity, on the optimizers. This is not true only in the retail industry butvirtually in any other in which a service is delivered:

routing, I can check with my mobile device where cabs/buses are located;traffic management, I am aware of congestions, accidents, etc. in the city;cache allocation for video streaming, complaints escalate in real time.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 10 / 24

Page 15: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

What I like of Big Data

The first example shows that automatic collection of data can lead to thedefinition of new (optimization) problems.

Disseminating sensors (including mobile devices) everywhere has becomecheap (and cool!) but the real challenge is taking decisions over the collected(complex) data.

It is not completely clear if the (applied) optimization problems we were usedto solve in contexts as diverse as routing, supply chain and logistics, energy,telecommunications, etc. are still there or, instead, have radically changed.

The spirit of such a change is shown by the second example: the end-users“behavior” is putting more and more pressure on the decision makers and, bytransitivity, on the optimizers.

This is not true only in the retail industry butvirtually in any other in which a service is delivered:

routing, I can check with my mobile device where cabs/buses are located;traffic management, I am aware of congestions, accidents, etc. in the city;cache allocation for video streaming, complaints escalate in real time.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 10 / 24

Page 16: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

What I like of Big Data

The first example shows that automatic collection of data can lead to thedefinition of new (optimization) problems.

Disseminating sensors (including mobile devices) everywhere has becomecheap (and cool!) but the real challenge is taking decisions over the collected(complex) data.

It is not completely clear if the (applied) optimization problems we were usedto solve in contexts as diverse as routing, supply chain and logistics, energy,telecommunications, etc. are still there or, instead, have radically changed.

The spirit of such a change is shown by the second example: the end-users“behavior” is putting more and more pressure on the decision makers and, bytransitivity, on the optimizers. This is not true only in the retail industry butvirtually in any other in which a service is delivered:

routing, I can check with my mobile device where cabs/buses are located;traffic management, I am aware of congestions, accidents, etc. in the city;cache allocation for video streaming, complaints escalate in real time.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 10 / 24

Page 17: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

What I like of Big Data (cont.d)The most significant effect of considering the end-users behavior is thatcomplex systems that have been traditionally split into smaller parts,optimized sequentially, now need to be tackled in an integrated fashion.

Splitting was happening because of1 difficulty and cost of collecting reliable data for the entire system2 the size of the decision problems associated with considering the entire

system would have been too large3 there was very little perception both among the industrial players and

among the end-users that splitting was avoidable.

Lack of technological communication:the different divisions of, say, a firm, had little data exchange, andthe end-user had no mobile technology to be updated in real time.

Mobile technology has urged the request of integrated approaches fordecision making because of the perception of missing opportunities.

I believe this is true in energy as well because smart meters and smartbuildings (producing energy as well as consuming it) are increasingend-users’ awareness and pushing for more (integrated) optimization.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 11 / 24

Page 18: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

What I like of Big Data (cont.d)The most significant effect of considering the end-users behavior is thatcomplex systems that have been traditionally split into smaller parts,optimized sequentially, now need to be tackled in an integrated fashion.Splitting was happening because of

1 difficulty and cost of collecting reliable data for the entire system2 the size of the decision problems associated with considering the entire

system would have been too large3 there was very little perception both among the industrial players and

among the end-users that splitting was avoidable.

Lack of technological communication:the different divisions of, say, a firm, had little data exchange, andthe end-user had no mobile technology to be updated in real time.

Mobile technology has urged the request of integrated approaches fordecision making because of the perception of missing opportunities.

I believe this is true in energy as well because smart meters and smartbuildings (producing energy as well as consuming it) are increasingend-users’ awareness and pushing for more (integrated) optimization.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 11 / 24

Page 19: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

What I like of Big Data (cont.d)The most significant effect of considering the end-users behavior is thatcomplex systems that have been traditionally split into smaller parts,optimized sequentially, now need to be tackled in an integrated fashion.Splitting was happening because of

1 difficulty and cost of collecting reliable data for the entire system2 the size of the decision problems associated with considering the entire

system would have been too large3 there was very little perception both among the industrial players and

among the end-users that splitting was avoidable.

Lack of technological communication:the different divisions of, say, a firm, had little data exchange, andthe end-user had no mobile technology to be updated in real time.

Mobile technology has urged the request of integrated approaches fordecision making because of the perception of missing opportunities.

I believe this is true in energy as well because smart meters and smartbuildings (producing energy as well as consuming it) are increasingend-users’ awareness and pushing for more (integrated) optimization.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 11 / 24

Page 20: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

What I like of Big Data (cont.d)The most significant effect of considering the end-users behavior is thatcomplex systems that have been traditionally split into smaller parts,optimized sequentially, now need to be tackled in an integrated fashion.Splitting was happening because of

1 difficulty and cost of collecting reliable data for the entire system2 the size of the decision problems associated with considering the entire

system would have been too large3 there was very little perception both among the industrial players and

among the end-users that splitting was avoidable.

Lack of technological communication:the different divisions of, say, a firm, had little data exchange, andthe end-user had no mobile technology to be updated in real time.

Mobile technology has urged the request of integrated approaches fordecision making because of the perception of missing opportunities.

I believe this is true in energy as well because smart meters and smartbuildings (producing energy as well as consuming it) are increasingend-users’ awareness and pushing for more (integrated) optimization.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 11 / 24

Page 21: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

What I like of Big Data (cont.d)The most significant effect of considering the end-users behavior is thatcomplex systems that have been traditionally split into smaller parts,optimized sequentially, now need to be tackled in an integrated fashion.Splitting was happening because of

1 difficulty and cost of collecting reliable data for the entire system2 the size of the decision problems associated with considering the entire

system would have been too large3 there was very little perception both among the industrial players and

among the end-users that splitting was avoidable.

Lack of technological communication:the different divisions of, say, a firm, had little data exchange, andthe end-user had no mobile technology to be updated in real time.

Mobile technology has urged the request of integrated approaches fordecision making because of the perception of missing opportunities.

I believe this is true in energy as well because smart meters and smartbuildings (producing energy as well as consuming it) are increasingend-users’ awareness and pushing for more (integrated) optimization.Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 11 / 24

Page 22: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

Integrated models: (the dream of) big data in retail

6  

FINANCIAL PLANNING

MARKDOWN SIMULATION

LIFECYCLE & PRICING

OTHER LEVERS

ITEM RECOMMENDATION & RANGING

OMNI-CHANNEL DEPTH CALCULATION

BUYING PROCESS

PREPACK

SIZE SCALING

ITEM GENERATION

SPACE ALLOCATION

CONTI-NUOUS LEARNI

NG

ALLOCATION & REPLENISHMENT

PLAN ASSORT BUY REACT

PLANNING   CONTINUOUS  PLANNING   REAL  TIME  

ATTRIBUTE BASED

SCORING

DEMAND TRANSFERA-

BILITY DATA

DRIVEN BI

CLICK- STREAM

ANALYSIS

SOCIAL MEDIA

ANALYSIS

TOOLS

CLUSTERING

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 12 / 24

Page 23: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

The role of learning

From an optimization perspective, formulating and solving those integratedmodels is, of course, hard.

This is because of1 volume2 velocity3 variety

of the data, and also because optimizers are not – in general – trained for that.

One answer to this is introducing into the picture some learning mechanismsthat allow to treat data, often reducing their volume and variety, and to takeinto account the end-user perspective/behavior.

In the retail context, one needs to predicts the sales of a certain product, on acertain shop location, in a certain season, to a certain segment of shoppers.

Learning from historical data allows to compute a score associated with thesechoices and the optimization problem associated with the assortment can besolved only after these scores are computed.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 13 / 24

Page 24: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

The role of learning

From an optimization perspective, formulating and solving those integratedmodels is, of course, hard.

This is because of1 volume2 velocity3 variety

of the data, and also because optimizers are not – in general – trained for that.

One answer to this is introducing into the picture some learning mechanismsthat allow to treat data, often reducing their volume and variety, and to takeinto account the end-user perspective/behavior.

In the retail context, one needs to predicts the sales of a certain product, on acertain shop location, in a certain season, to a certain segment of shoppers.

Learning from historical data allows to compute a score associated with thesechoices and the optimization problem associated with the assortment can besolved only after these scores are computed.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 13 / 24

Page 25: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

The role of learning

From an optimization perspective, formulating and solving those integratedmodels is, of course, hard.

This is because of1 volume2 velocity3 variety

of the data, and also because optimizers are not – in general – trained for that.

One answer to this is introducing into the picture some learning mechanismsthat allow to treat data, often reducing their volume and variety, and to takeinto account the end-user perspective/behavior.

In the retail context, one needs to predicts the sales of a certain product, on acertain shop location, in a certain season, to a certain segment of shoppers.

Learning from historical data allows to compute a score associated with thesechoices and the optimization problem associated with the assortment can besolved only after these scores are computed.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 13 / 24

Page 26: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

The role of learning

From an optimization perspective, formulating and solving those integratedmodels is, of course, hard.

This is because of1 volume2 velocity3 variety

of the data, and also because optimizers are not – in general – trained for that.

One answer to this is introducing into the picture some learning mechanismsthat allow to treat data, often reducing their volume and variety, and to takeinto account the end-user perspective/behavior.

In the retail context, one needs to predicts the sales of a certain product, on acertain shop location, in a certain season, to a certain segment of shoppers.

Learning from historical data allows to compute a score associated with thesechoices and the optimization problem associated with the assortment can besolved only after these scores are computed.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 13 / 24

Page 27: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

Ex. 3: taking into account the end-user

33  

Shopper  Segmenta.on  

-4

-2

0 2 4

-4 -3 -2 -1 0 1 2 3 4

-4 -3 -2 -1 0 1 2 3 4

> Segments  are  created  based  on  behaviors  and  preferences  that  bring  value  to  the  business  

 > These  variables  must  reveal  opportuni.es  for  ac.on,  to  be  able  to  bring  segmenta.on  to  tangible  outcomes.  

FEATURES  ENGINEERING  

CLUSTERING  ALGORITHM  

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 14 / 24

Page 28: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

Ex. 3: taking into account the end-user (cont.d)

51  

ITEMS  ATTRIBUTES  VALUES  

SEASONS  ATTRIBUTES  VALUES  

LOCATIONS  ATTRIBUTES  VALUES  

A2ribute  Based  Forecas?ng  

SHOPPERS    SEGMENTS  

Brand SuperClean Fragrance Fruits Price  Band Good

Size   Small

Sales  in  Store  A,  for  Segment  A,  in  2015 ?

Never  seen  product  

User  judgment  

Linear  regressions  on  a2ributes  

Neural  networks  

Random  forests  

Computerized  adapta?ve  tes?ng  

Support  vector  machine  

…  

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 15 / 24

Page 29: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

Machine Learning

Generally speaking, Machine Learning is a collection of techniques forlearning patterns in orunderstanding the structure of data,

often with the aim of performing data mining, i.e., recovering previouslyunknown, actionable information from the learnt data.

Typically, in ML one has to “learn” from data (points in the so-called trainingset) a (nonlinear) function that predicts a certain score for new data points thatare not in the training set.

Each data point is represented by a set of features, which define itscharacteristics, and whose patterns should be learnt.

The techniques used in ML are diverse, going from artificial neural networks,to first order methods like gradient descent, to convex optimization, etc.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 16 / 24

Page 30: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

Machine Learning

Generally speaking, Machine Learning is a collection of techniques forlearning patterns in orunderstanding the structure of data,

often with the aim of performing data mining, i.e., recovering previouslyunknown, actionable information from the learnt data.

Typically, in ML one has to “learn” from data (points in the so-called trainingset) a (nonlinear) function that predicts a certain score for new data points thatare not in the training set.

Each data point is represented by a set of features, which define itscharacteristics, and whose patterns should be learnt.

The techniques used in ML are diverse, going from artificial neural networks,to first order methods like gradient descent, to convex optimization, etc.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 16 / 24

Page 31: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

Machine Learning

Generally speaking, Machine Learning is a collection of techniques forlearning patterns in orunderstanding the structure of data,

often with the aim of performing data mining, i.e., recovering previouslyunknown, actionable information from the learnt data.

Typically, in ML one has to “learn” from data (points in the so-called trainingset) a (nonlinear) function that predicts a certain score for new data points thatare not in the training set.

Each data point is represented by a set of features, which define itscharacteristics, and whose patterns should be learnt.

The techniques used in ML are diverse, going from artificial neural networks,to first order methods like gradient descent, to convex optimization, etc.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 16 / 24

Page 32: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

Machine Learning

Generally speaking, Machine Learning is a collection of techniques forlearning patterns in orunderstanding the structure of data,

often with the aim of performing data mining, i.e., recovering previouslyunknown, actionable information from the learnt data.

Typically, in ML one has to “learn” from data (points in the so-called trainingset) a (nonlinear) function that predicts a certain score for new data points thatare not in the training set.

Each data point is represented by a set of features, which define itscharacteristics, and whose patterns should be learnt.

The techniques used in ML are diverse, going from artificial neural networks,to first order methods like gradient descent, to convex optimization, etc.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 16 / 24

Page 33: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

Machine Learning in retail

69  

Learning  Process

69  

Historical  data

Model  Training

0.7    0.9    

0.77    

0.65  …        

Trained  Model

Score

Compute  the  

w  and  v w  and  v  are  now  known

0.11    

0.20    

0.37    …          

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 17 / 24

Page 34: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

ML & Mathematical Optimization

I believe big data applications call for the integration between MachineLearning and Mathematical Optimization.

But, how such an integration should go?And, what about Integer Programming specifically?

Of course, the easiest integration is already shown in the examples above,where raw data are “crunched” and “prepared” by Machine Learning toconstruct the decision model on which Mathematical Optimization is applied.

However, the integration is not restricted to let ML and MP work in cascade.

Modern ML paradigms like deep learning (essentially, learning by multiplelayers) are facing more and more complicated structures in which the features(raw data observations) are not kept fixed but are “transformed” within thelearning process.

Those transformations involve highly nonconvex functions and discretedecisions.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 18 / 24

Page 35: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

ML & Mathematical Optimization

I believe big data applications call for the integration between MachineLearning and Mathematical Optimization.

But, how such an integration should go?And, what about Integer Programming specifically?

Of course, the easiest integration is already shown in the examples above,where raw data are “crunched” and “prepared” by Machine Learning toconstruct the decision model on which Mathematical Optimization is applied.

However, the integration is not restricted to let ML and MP work in cascade.

Modern ML paradigms like deep learning (essentially, learning by multiplelayers) are facing more and more complicated structures in which the features(raw data observations) are not kept fixed but are “transformed” within thelearning process.

Those transformations involve highly nonconvex functions and discretedecisions.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 18 / 24

Page 36: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

ML & Mathematical Optimization

I believe big data applications call for the integration between MachineLearning and Mathematical Optimization.

But, how such an integration should go?And, what about Integer Programming specifically?

Of course, the easiest integration is already shown in the examples above,where raw data are “crunched” and “prepared” by Machine Learning toconstruct the decision model on which Mathematical Optimization is applied.

However, the integration is not restricted to let ML and MP work in cascade.

Modern ML paradigms like deep learning (essentially, learning by multiplelayers) are facing more and more complicated structures in which the features(raw data observations) are not kept fixed but are “transformed” within thelearning process.

Those transformations involve highly nonconvex functions and discretedecisions.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 18 / 24

Page 37: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

ML & Mathematical Optimization

I believe big data applications call for the integration between MachineLearning and Mathematical Optimization.

But, how such an integration should go?And, what about Integer Programming specifically?

Of course, the easiest integration is already shown in the examples above,where raw data are “crunched” and “prepared” by Machine Learning toconstruct the decision model on which Mathematical Optimization is applied.

However, the integration is not restricted to let ML and MP work in cascade.

Modern ML paradigms like deep learning (essentially, learning by multiplelayers) are facing more and more complicated structures in which the features(raw data observations) are not kept fixed but are “transformed” within thelearning process.

Those transformations involve highly nonconvex functions and discretedecisions.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 18 / 24

Page 38: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

Ex. 4: combinatorial explosion

Why are learning and optimization two faces of the same coin?

A nice example comes in healthcare.

ML could be used to predict the medical outcome that would follow from aparticular choice of combination and dosage of different drugs and treatmentsfor a patient over the course of a few months to come.

However, there could be an exponential number of such combinations toconsider, and constraints to be satisfied (for example, because of knownside-effects and resources).

Exhaustively searching in the space of such combinations is and will alwaysbe unpractical and mathematical optimization is likely to be the answer.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 19 / 24

Page 39: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

The role of discrete decisions

Discrete decisions have been disregarded so far in ML.

This is certainly due to the (negative) perception that were not affordable inpractical computation (ML has always been concerned with large volumes ofdata) but it was also related to the fact that the parameters to be learnt wereinherently continuous.

This might be less true in modern paradigms, those that led ML to contributeto the advances in computer vision, signal processing and speech recognition.

Moreover, there seems to be large room for using discrete variables toformulate nonconvexities that appear more and more to be crucial in ML.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 20 / 24

Page 40: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

The role of discrete decisions

Discrete decisions have been disregarded so far in ML.

This is certainly due to the (negative) perception that were not affordable inpractical computation (ML has always been concerned with large volumes ofdata) but it was also related to the fact that the parameters to be learnt wereinherently continuous.

This might be less true in modern paradigms, those that led ML to contributeto the advances in computer vision, signal processing and speech recognition.

Moreover, there seems to be large room for using discrete variables toformulate nonconvexities that appear more and more to be crucial in ML.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 20 / 24

Page 41: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

Discrete decisions in Support Vector Machine

Ramp Loss g(ξi) = (min{ξi ,2})+

minω>ω

2+

Cn(

n∑i=1

ξi+2n∑

i=1

zi)

yi(ω>xi + b) ≥ 1− ξi−Mzi ∀i = 1, . . . ,n

0 ≤ ξi≤ 2 ∀i = 1, . . . ,nz ∈ {0,1}n

ω ∈ Rd ,b ∈ R

with M > 0 big enough constant.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 21 / 24

Page 42: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

More sophisticated nonlinear models/algorithms

I recently became aware that polynomial optimization (testing nonnegativity ofpolynomials) can potentially be used in Machine Learning.

It is likely there is room for more sophisticated ingredients in MachineLearning both on the function side (predicting functions, generally called“activation” functions) and on the algorithmic side.

In addition, the combination of nonlinear functions and discrete decisionscould make the learning mechanisms more ambitious.

This is true in our running example in retail, where currently the substitutioneffect of several products in the potential assortment is not directly taken intoaccount by ML in computing the scores.

In other words, computing scores for pairs of (substitute) products or for entireassortments (discrete sets) could lead to more sophisticated MINLPs to workwith.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 22 / 24

Page 43: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

More sophisticated nonlinear models/algorithms

I recently became aware that polynomial optimization (testing nonnegativity ofpolynomials) can potentially be used in Machine Learning.

It is likely there is room for more sophisticated ingredients in MachineLearning both on the function side (predicting functions, generally called“activation” functions) and on the algorithmic side.

In addition, the combination of nonlinear functions and discrete decisionscould make the learning mechanisms more ambitious.

This is true in our running example in retail, where currently the substitutioneffect of several products in the potential assortment is not directly taken intoaccount by ML in computing the scores.

In other words, computing scores for pairs of (substitute) products or for entireassortments (discrete sets) could lead to more sophisticated MINLPs to workwith.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 22 / 24

Page 44: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

More sophisticated nonlinear models/algorithms

I recently became aware that polynomial optimization (testing nonnegativity ofpolynomials) can potentially be used in Machine Learning.

It is likely there is room for more sophisticated ingredients in MachineLearning both on the function side (predicting functions, generally called“activation” functions) and on the algorithmic side.

In addition, the combination of nonlinear functions and discrete decisionscould make the learning mechanisms more ambitious.

This is true in our running example in retail, where currently the substitutioneffect of several products in the potential assortment is not directly taken intoaccount by ML in computing the scores.

In other words, computing scores for pairs of (substitute) products or for entireassortments (discrete sets) could lead to more sophisticated MINLPs to workwith.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 22 / 24

Page 45: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

More sophisticated nonlinear models/algorithms

I recently became aware that polynomial optimization (testing nonnegativity ofpolynomials) can potentially be used in Machine Learning.

It is likely there is room for more sophisticated ingredients in MachineLearning both on the function side (predicting functions, generally called“activation” functions) and on the algorithmic side.

In addition, the combination of nonlinear functions and discrete decisionscould make the learning mechanisms more ambitious.

This is true in our running example in retail, where currently the substitutioneffect of several products in the potential assortment is not directly taken intoaccount by ML in computing the scores.

In other words, computing scores for pairs of (substitute) products or for entireassortments (discrete sets) could lead to more sophisticated MINLPs to workwith.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 22 / 24

Page 46: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

More sophisticated nonlinear models (cont.d)

Copyright 2013 JDA Software Group, Inc. - CONFIDENTIAL

Product Portfolio Optimization

5

A B C D

A B

C D

𝑑′𝐴, 𝑑′𝐵, 𝑑′𝐶, 𝑑′𝐷

Demand with substitution

A B

C D

𝑑𝐴, 𝑑𝐵, 𝑑𝐶, 𝑑𝐷

Demand with no substitution

𝑐𝐴, 𝑐𝐵, 𝑐𝐶, 𝑐𝐷 Inferred from scores

Substitution effect for each item occurs between the pair of most similar items in the final assortment

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 23 / 24

Page 47: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

Conclusions

We have discussed a few important issues arising in big data (optimization),namely

the change of perspective associated with dealing with the end-usersbehavior,the need of formulating and solving integrated models, andthe role of (machine) learning.

I am an optimistic person, so I see huge opportunities through the interactionbetween Machine Learning and Mathematical Optimization, especially on theInteger Programming side.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 24 / 24

Page 48: Big Data & Mixed-Integer (Nonlinear) Programming · 2016. 2. 1. · Big Data & Mixed-Integer (Nonlinear) Programming Andrea Lodi joint work withMarie-Claude Cotˆ e´(JDA) Canada

Conclusions

We have discussed a few important issues arising in big data (optimization),namely

the change of perspective associated with dealing with the end-usersbehavior,the need of formulating and solving integrated models, andthe role of (machine) learning.

I am an optimistic person, so I see huge opportunities through the interactionbetween Machine Learning and Mathematical Optimization, especially on theInteger Programming side.

Andrea Lodi (CERC, Polytechnique Montreal) Big Data & Integer Programming ATI 2016 24 / 24