Input Output Hidden Markov Model Implementation in Python
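[Title]: Input Output Hidden Markov Model Implementation in Python
[Posted]: 2019-10-31 15:14:30
[Question]:
I am trying to implement a hidden Markov model with an input-output architecture, but I cannot find a good Python implementation.
Can anyone recommend a Python package for HMMs that covers the following requirements?
Continuous emissions are allowed.
Covariates (i.e. the independent variables of an I/O HMM) are allowed.
So far I have been unable to find a Python implementation that offers both.
I could not find a relevant example in hmmlearn.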
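To make the two requirements concrete, here is a minimal NumPy sketch of the generative process of an I/O HMM with continuous emissions: in each hidden state the output is a separate linear function of the covariate plus Gaussian noise. For simplicity the transition matrix is fixed, whereas a full I/O HMM may also let the covariates drive the transitions. The function name `simulate_iohmm` and all parameter values are made up for illustration and do not come from any library.

```python
import numpy as np


def simulate_iohmm(inputs, startprob, transmat, coefs, intercepts, noise_std, seed=0):
    """Sample hidden states and continuous outputs given an input sequence.

    In state k the output at time t is drawn from
    Normal(intercepts[k] + coefs[k] * inputs[t], noise_std[k]).
    """
    rng = np.random.default_rng(seed)
    n_samples = len(inputs)
    n_states = len(startprob)
    states = np.empty(n_samples, dtype=int)
    outputs = np.empty(n_samples)
    for t in range(n_samples):
        # The first step uses the initial distribution, later steps the transition row.
        probs = startprob if t == 0 else transmat[states[t - 1]]
        states[t] = rng.choice(n_states, p=probs)
        k = states[t]
        mean = intercepts[k] + coefs[k] * inputs[t]
        outputs[t] = rng.normal(mean, noise_std[k])
    return states, outputs


# Hypothetical two-state example: the covariate affects the output differently per state.
inputs = np.linspace(0.0, 1.0, 50)
startprob = np.array([0.6, 0.4])
transmat = np.array([[0.9, 0.1],
                     [0.2, 0.8]])
coefs = np.array([2.0, -1.0])
intercepts = np.array([0.0, 5.0])
noise_std = np.array([0.1, 0.1])
states, outputs = simulate_iohmm(inputs, startprob, transmat, coefs, intercepts, noise_std)
```

Fitting such a model means recovering the per-state regression coefficients and the hidden state path from `inputs` and `outputs` alone, which is the job the packages listed in the question are being compared on.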
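Here are the libraries I have tested:
hmmlearn: allows multiple features to be passed as emissions/observations, but offers no support for covariates (i.e. independent variables).
hmms: supports neither continuous emissions nor independent variables.
IOHMM: I was able to train an HMM with this library, but I could not find documentation on how to make predictions once the model is trained.
I am therefore looking for a package that fits this purpose.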
from IOHMM import UnSupervisedIOHMM
from IOHMM import OLS, DiscreteMNL, CrossEntropyMNL, forward_backward

# `data` is assumed to be a pandas DataFrame with 'Insulin' and 'Glucose' columns.
SHMM = UnSupervisedIOHMM(num_states=3, max_EM_iter=200, EM_tol=1e-6)
SHMM.set_models(model_emissions=[OLS(est_stderr=True)],
                model_transition=CrossEntropyMNL(solver='lbfgs'),
                model_initial=CrossEntropyMNL(solver='lbfgs'))
# Only the emission model gets a covariate; the initial and transition models get none.
SHMM.set_inputs(covariates_initial=[], covariates_transition=[],
                covariates_emissions=[['Insulin']])
SHMM.set_outputs([['Glucose']])
SHMM.set_data([data])
SHMM.train()
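After the training above, I cannot figure out how to obtain the emission probabilities and the hidden state sequence.
[Comments]:
It would be helpful if you could explain how you get the output. What is the underlying logic?
The output is just the emissions/observations. Which part of hmmlearn don't you understand? Nobody here will write an example for you, because a) we don't write code for people, and b) you haven't given us any indication that you have tried anything yourself.
Hi Chris, thanks for your input. I have edited the question to make the problem easier to understand.
According to github.com/Mogeng/IOHMM/blob/master/examples/notebooks/…, you only need SHMM.model_emissions to get the emissions.
[Answer 1]:
Referring to "https://web.ece.ucsb.edu/Faculty/Rabiner/ece259/Reprints/tutorial%20on%20hmm%20and%20applications.pdf" and the library "https://hmmlearn.readthedocs.io/en/latest/", I found this solution:
1- Via log_gamma (the posterior distribution):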
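import numpy as np

# `lengths`, `df` and `df_A` come from the answerer's own data and are not defined in the post;
# 100 is presumably the number of sequences.
state_sequences = []
for i in range(100):
    for j in range(lengths[i]):
        # argmax of the log-posterior equals argmax of the posterior, so no exp is needed
        state_sequences.append(np.argmax(SHMM.log_gammas[i][j]))
# Split the flat state list back into one slice per unit, using each unit's row range.
pred_state_seq = [state_sequences[df[df['unit'] == i].index[0]:df[df['unit'] == i].index[-1] + 1]
                  for i in range(1, df_A['unit'].max() + 1)]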
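The same posterior decoding can be tried without a trained model. The sketch below assumes the posteriors are available as a list holding one `(n_samples, n_states)` array of log-probabilities per sequence, which is how the loop in this answer indexes `SHMM.log_gammas`. The toy arrays and the helper name `posterior_decode` are invented for illustration.

```python
import numpy as np


def posterior_decode(log_gammas):
    """Return, for each sequence, the most probable state at every time step."""
    # argmax over log-probabilities equals argmax over probabilities,
    # so exponentiating first is not needed.
    return [np.argmax(log_gamma, axis=1) for log_gamma in log_gammas]


# Toy posteriors for two sequences of length 3 and 2 (each row sums to 1).
toy_log_gammas = [
    np.log(np.array([[0.7, 0.2, 0.1],
                     [0.1, 0.8, 0.1],
                     [0.2, 0.3, 0.5]])),
    np.log(np.array([[0.05, 0.05, 0.9],
                     [0.6, 0.3, 0.1]])),
]
per_sequence = posterior_decode(toy_log_gammas)
flat = np.concatenate(per_sequence)

# Split the flat vector back into sequences using their lengths.
lengths = [len(g) for g in toy_log_gammas]
recovered = np.split(flat, np.cumsum(lengths)[:-1])
```

Note that this picks the individually most likely state at each step, so the resulting path is not guaranteed to be consistent with the transition matrix; the Viterbi approach optimizes over whole paths instead.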
2- Viterbi algorithm:
import numpy as np
from hmmlearn import _hmmc  # private hmmlearn module; its API can differ between releases

num_states = 3  # assumed to match num_states=3 passed to UnSupervisedIOHMM in the question

# Stack one row of transition probabilities per origin state (no transition covariates).
transmat = np.vstack([np.exp(SHMM.model_transition[i].predict_log_proba(np.array([[]])))
                      for i in range(num_states)])
startprob = np.exp(SHMM.model_initial.predict_log_proba(np.array([[]]))).squeeze()


def log_mask_zero(a):
    """
    Compute the log of input probabilities masking divide by zero in log.

    Notes
    -----
    During the M-step of EM-algorithm, very small intermediate start
    or transition probabilities could be normalized to zero, causing a
    *RuntimeWarning: divide by zero encountered in log*.

    This function masks this unharmful warning.
    """
    a = np.asarray(a)
    with np.errstate(divide="ignore"):
        return np.log(a)


def _do_viterbi_pass(framelogprob):
    n_samples, n_components = framelogprob.shape
    state_sequence, logprob = _hmmc._viterbi(n_samples, n_components, log_mask_zero(startprob),
                                             log_mask_zero(transmat), framelogprob)
    return logprob, state_sequence


def _decode_viterbi(X):
    # X is the index of a sequence; SHMM.log_Eys[X] holds its emission log-probabilities.
    framelogprob = SHMM.log_Eys[X]
    return _do_viterbi_pass(framelogprob)


def decode():
    decoder = {"viterbi": _decode_viterbi}["viterbi"]
    logprob = 0
    sub_state_sequences = []
    for sub_X in range(100):  # loop over the sequences (100, as in method 1)
        # XXX decoder works on a single sample at a time!
        sub_logprob, sub_state_sequence = decoder(sub_X)
        logprob += sub_logprob
        sub_state_sequences.append(sub_state_sequence)
    return logprob, np.concatenate(sub_state_sequences)


def predict():
    """
    Find the most likely state sequence for all sequences.

    Returns
    -------
    logprob : float
        Sum of the log-probabilities of the decoded paths.
    state_sequence : array, shape (n_samples, )
        Labels for each sample, concatenated over sequences.
    """
    logprob, state_sequence = decode()
    return logprob, state_sequence


_, state_seq = predict()
pred_state_seq = [state_seq[df[df['unit'] == i].index[0]:df[df['unit'] == i].index[-1] + 1]
                  for i in range(1, df_A['unit'].max() + 1)]
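Because `_hmmc._viterbi` is a private hmmlearn routine, it may help to see the recursion it stands for. Below is a small pure-NumPy Viterbi in log space that follows the standard formulation from the Rabiner tutorial linked in this answer. It is an illustrative sketch with hypothetical toy numbers, not hmmlearn's or IOHMM's implementation, and the function name `viterbi` is chosen here for the example.

```python
import numpy as np


def viterbi(log_startprob, log_transmat, framelogprob):
    """Most likely state path and its log-probability, computed in log space."""
    n_samples, n_states = framelogprob.shape
    delta = np.empty((n_samples, n_states))  # best log-prob of a path ending in each state
    backptr = np.zeros((n_samples, n_states), dtype=int)

    delta[0] = log_startprob + framelogprob[0]
    for t in range(1, n_samples):
        # scores[i, j]: best path ending in state i at t-1, then moving to state j.
        scores = delta[t - 1][:, None] + log_transmat
        backptr[t] = np.argmax(scores, axis=0)
        delta[t] = np.max(scores, axis=0) + framelogprob[t]

    # Backtrack from the best final state.
    path = np.empty(n_samples, dtype=int)
    path[-1] = np.argmax(delta[-1])
    for t in range(n_samples - 2, -1, -1):
        path[t] = backptr[t + 1, path[t + 1]]
    return float(np.max(delta[-1])), path


# Toy two-state example with hypothetical numbers.
with np.errstate(divide="ignore"):  # log(0) -> -inf is intended here
    log_startprob = np.log(np.array([1.0, 0.0]))
    log_transmat = np.log(np.array([[0.5, 0.5],
                                    [0.0, 1.0]]))
    framelogprob = np.log(np.array([[0.9, 0.1],
                                    [0.2, 0.8],
                                    [0.1, 0.9]]))
logprob, path = viterbi(log_startprob, log_transmat, framelogprob)
```

With start and transition probabilities extracted from a trained model and a per-sequence matrix of emission log-probabilities, a function like this could stand in for the private call, at the cost of running in Python rather than compiled code.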