
  • Loss Function – Quick Notes –

    Dragan Samardzija

    January 2020


  • References

    1. Wikipedia

    2. Data Science: Deep Learning in Python

    https://www.youtube.com/watch?v=XeQBsidyhWE

    3. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow by Aurélien Géron

    https://www.youtube.com/watch?v=ErfnhcEV1O8

    4. Deep Learning by Ian Goodfellow, Yoshua Bengio and Aaron Courville



  • Likelihood Interpretation

    Information Theory Interpretation


  • Square Error Loss Function – Minimize
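
    The slide's formula did not survive extraction; as a sketch, writing \hat{y}(x_n; w) for the model output (notation assumed here, not taken from the slide), the squared-error loss over N training pairs (x_n, y_n) is

    \mathcal{L}_{\mathrm{SE}}(w) = \sum_{n=1}^{N} \bigl( y_n - \hat{y}(x_n; w) \bigr)^2, \qquad w^{\star} = \arg\min_{w} \mathcal{L}_{\mathrm{SE}}(w)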

  • Likelihood – Gaussian Assumption – Maximize

    The same answer, since log() is a monotonically increasing function.
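
    As a sketch of why the two give the same answer, assume i.i.d. Gaussian noise y_n = \hat{y}(x_n; w) + \varepsilon_n with \varepsilon_n \sim \mathcal{N}(0, \sigma^2). The likelihood of the observed targets is

    p(y \mid w) = \prod_{n=1}^{N} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left( -\frac{\bigl( y_n - \hat{y}(x_n; w) \bigr)^2}{2\sigma^2} \right)

    and taking the logarithm gives

    \log p(y \mid w) = -\frac{1}{2\sigma^2} \sum_{n=1}^{N} \bigl( y_n - \hat{y}(x_n; w) \bigr)^2 + \text{const},

    so maximizing the likelihood is equivalent to minimizing the squared-error loss \mathcal{L}_{\mathrm{SE}}(w).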

  • Cross Entropy Loss Function – Binary Classification – Minimize
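
    This slide's formula is also missing from the transcript; the standard binary cross-entropy loss, with targets t_n \in \{0, 1\} and predicted probabilities p_n = \hat{y}(x_n; w) \in (0, 1), is

    \mathcal{L}_{\mathrm{CE}}(w) = -\sum_{n=1}^{N} \bigl[ t_n \log p_n + (1 - t_n) \log (1 - p_n) \bigr]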

  • Likelihood – Maximize

    The same answer, since log() is a monotonically increasing function.
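
    As a sketch, modeling each label as Bernoulli with success probability p_n, the likelihood is

    p(t \mid w) = \prod_{n=1}^{N} p_n^{\,t_n} (1 - p_n)^{1 - t_n}, \qquad \log p(t \mid w) = \sum_{n=1}^{N} \bigl[ t_n \log p_n + (1 - t_n) \log (1 - p_n) \bigr] = -\mathcal{L}_{\mathrm{CE}}(w),

    so maximizing the log-likelihood is the same as minimizing the cross-entropy loss.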

  • Illustration

  • Likelihood Interpretation

    Information Theory Interpretation


  • Number of Bits Needed to Encode

    • Information entropy is the average bit rate at which information is produced by a stochastic source of data.

    (Pictured: Claude Shannon and Ludwig Boltzmann)
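
    In symbols (the slide's formula is not in the transcript), for a source emitting symbol x with probability p(x), the entropy is

    H(p) = -\sum_{x} p(x) \log_2 p(x) \quad \text{bits per symbol.}

    For example, a fair coin has H = 1 bit, while a biased coin with p(heads) = 0.9 has H = -(0.9 \log_2 0.9 + 0.1 \log_2 0.1) \approx 0.47 bits.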

  • Number of Bits when Mismatched

    • Cross entropy between two probability distributions p and q measures the average number of bits needed to identify an event drawn from the set if the coding scheme used for the set is optimized for an estimated probability distribution q, rather than for the true distribution p.

    • Minimal cross entropy is achieved when the distributions p and q are identical, i.e., when the cross entropy reduces to the entropy.
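
    As a sketch of the definition in symbols:

    H(p, q) = -\sum_{x} p(x) \log_2 q(x) = H(p) + D_{\mathrm{KL}}(p \,\|\, q) \;\ge\; H(p),

    with equality exactly when q = p, since the Kullback-Leibler divergence is non-negative (Gibbs' inequality). For example, encoding a fair coin p = (0.5, 0.5) with a code optimized for q = (0.9, 0.1) costs H(p, q) = -(0.5 \log_2 0.9 + 0.5 \log_2 0.1) \approx 1.74 bits per symbol instead of 1.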