It’s also important that you base your guess of what’s probably good to read next only on the previous thing you read. Forget everything that came before that. |

If there’s a finite amount of literature on the subject, this advice will send the OP in circles with probability one. |
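
To make the "going in circles" point concrete, here is a minimal sketch. It assumes the next read is picked uniformly at random from a finite pile, which is my simplification and not something the thread specifies; `steps_until_repeat` is a made-up name. With only `n_items` things to read, a repeat is forced within `n_items` steps by pigeonhole. Returning to one specific paper with probability one additionally needs every paper to be reachable from every other.

```python
import random

def steps_until_repeat(n_items, seed=0):
    """Count steps of a random reading walk until some item is read twice.

    Assumption (for illustration only): the next item is chosen uniformly
    at random from the whole finite pile, independent of the history.
    """
    rng = random.Random(seed)
    current = rng.randrange(n_items)
    seen = {current}
    steps = 0
    while True:
        current = rng.randrange(n_items)
        steps += 1
        if current in seen:
            return steps
        seen.add(current)

# Pigeonhole bound: the first repeat can never come later than step n_items.
worst = max(steps_until_repeat(50, seed=s) for s in range(100))
print(worst)
```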

It’s hilarious because it’s also a “semi”-decent method for learning new topics in general. |

You can then model the probability that you’ll end up at any given place after n steps as a Markov chain. |
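
A rough sketch of what that looks like in practice: put the one-step probabilities in a transition matrix and push a starting distribution through it n times. The three-item "reading list" and all the numbers in `P` are invented for illustration, and `step` and `distribution_after` are hypothetical helper names.

```python
# Hypothetical transition matrix: P[i][j] is the probability of reading
# item j next, given that item i was the last thing read. Rows sum to 1.
P = [
    [0.0, 0.7, 0.3],
    [0.5, 0.0, 0.5],
    [0.4, 0.6, 0.0],
]

def step(dist, P):
    """Advance one step: new_dist[j] = sum over i of dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def distribution_after(start, n_steps, P):
    """Probability of being at each item after n_steps, starting at `start`."""
    dist = [0.0] * len(P)
    dist[start] = 1.0
    for _ in range(n_steps):
        dist = step(dist, P)
    return dist

dist_10 = distribution_after(0, 10, P)
print(dist_10)
```

Only the current distribution is carried forward from step to step, which is the "forget everything that came before" property in code form.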

Markov chains are simple in essence. Instead of branching out and reading all the theory, I’d recommend learning it on an as-needed basis. Learn as you go: pick a problem and move ahead. I don’t think it is fruitful to learn everything about Markov chains just for the sake of it. Markov chain Monte Carlo for sampling from probability distributions is a good start (https://arxiv.org/abs/1206.1901) if you are into sampling. |
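
For a feel of what MCMC sampling involves, here is a minimal sketch of a plain random-walk Metropolis sampler. This is a simpler method that I picked for illustration, not necessarily the one the linked paper covers. The standard normal target, the step size, and the names `log_target` and `metropolis` are all assumptions of mine.

```python
import math
import random

def log_target(x):
    # Unnormalized log-density of a standard normal, chosen for illustration.
    return -0.5 * x * x

def metropolis(log_p, n_samples, step_size=1.0, x0=0.0, seed=0):
    """Random-walk Metropolis: propose x + noise, accept with prob min(1, p'/p)."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step_size)
        log_accept = log_p(proposal) - log_p(x)
        if log_accept >= 0 or rng.random() < math.exp(log_accept):
            x = proposal          # accept the move
        samples.append(x)         # on rejection the old x is recorded again
    return samples

samples = metropolis(log_target, 20000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, var)
```

The samples themselves form a Markov chain: each proposal depends only on the current `x`. For this target the sample mean should land near 0 and the variance near 1.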

Unless you’re interested in continuous Markov chains, infinite state spaces, renewal theory, excessive functions, and so on. |

You don’t need much math to pick up the very basic theory, but after a certain point you’re going to hit a hard wall unless you have a strong background in analysis. |

I think when you get out of the basic linear algebra and calculus prerequisites and into the analysis and measure theory prerequisites, nothing takes a few months anymore 🙂 |

That whole math sequence was part of my MBA program that culminated in Markov chains for synthetic options pricing after like, 9 months. And this is for business school students; not engineers 🙂 |

The Wikipedia page for Markov chains is really one of the best Wikipedia pages I’ve ever seen for a technical topic. It covers a ton of ground and gives concrete examples to motivate the ideas. |

Personally, I started with Eugene Charniak’s Statistical Language Learning [1], then continued with Manning and Schütze’s Foundations of Statistical Natural Language Processing [2] and Speech and Language Processing by Jurafsky and Martin [3]. The Charniak book is primarily about HMMs and quite short, so it’s the best introduction to the subject. Manning and Schütze and Jurafsky and Martin are much more extensive and cover pretty much all of statistical NLP up to their publication date (so no LSTMs, if I remember correctly), but they are required reading for an in-depth approach. You will definitely want to go beyond HMMs at some point, so you will probably want the other two books. But if you really just want to know about HMMs, then start with the Charniak.
______________
[1] https://mitpress.mit.edu/books/statistical-language-learning |

Highly recommended. My preferred way to learn is to build an intuitive understanding before diving deep into theory. This visual explainer is a great first step. |

How about a textbook, maybe? There aren’t always easy alternatives out there. Sometimes you have to bite the bullet and do the work. |

If the “motivation-theorem-proof” style appeals to you, find a copy of |

Do you have an application in mind to help guide suggestions? As others have said, if you don’t know probability, start there. |

You can find a copy of “Markov Chains and Mixing Times” online, which is good and relatively accessible. |

Just pick a random place to start, read some stuff, and then take a guess as to which direction to go in next, based on what’s probably a good next thing to read. Then keep repeating the process over and over again.