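http://bbs.sciencenet.cn/home.php?mod=space&uid=484653&do=blog&id=442300

Generative models and discriminative models are two concepts that come up constantly when working with classifiers. For an input x and a class label y, they differ as follows:
- A generative model estimates the joint probability distribution P(x,y).
- A discriminative model estimates the conditional probability distribution P(y|x).
A generative model yields a discriminative model through Bayes' rule, but not the other way around.

Andrew Ng has a NIPS 2001 paper devoted to comparing discriminative and generative models: On Discriminative vs. Generative classifiers: A comparison of logistic regression and naive Bayes (http://robotics.stanford.edu/~ang/papers/nips01-discriminativegenerative.pdf)

http://blog.sciencenet.cn/home.php?mod=space&uid=248173&do=blog&id=227964

【Summary】
- Generative model: infinite samples ==> probability density model = generative model ==> prediction
- Discriminative model: finite samples ==> discriminant function = predictive model ==> prediction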
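The claim that a joint distribution yields the conditional one, but not the reverse, can be checked on a tiny discrete example. The sketch below is only an illustration: the two probability tables (`joint` and `other_joint`) are made-up numbers, not data from any of the cited sources.

```python
# Hypothetical joint distribution P(x, y) over two binary variables (made-up numbers).
joint = {
    ("x0", "y0"): 0.30,
    ("x0", "y1"): 0.10,
    ("x1", "y0"): 0.15,
    ("x1", "y1"): 0.45,
}

# A second, different joint that happens to share the same conditional P(y | x).
other_joint = {
    ("x0", "y0"): 0.60,
    ("x0", "y1"): 0.20,
    ("x1", "y0"): 0.05,
    ("x1", "y1"): 0.15,
}


def marginal_x(table, x):
    """P(x) = sum over y of P(x, y)."""
    return sum(p for (xi, _), p in table.items() if xi == x)


def posterior(table, y, x):
    """P(y | x) = P(x, y) / P(x): Bayes' rule applied to the joint table."""
    return table[(x, y)] / marginal_x(table, x)


for x in ("x0", "x1"):
    for y in ("y0", "y1"):
        print(x, y, posterior(joint, y, x), posterior(other_joint, y, x))
```

Both tables give the same P(y|x) for every pair, yet they are different joints because P(x) differs. That is why knowing only P(y|x) is not enough to recover P(x,y).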
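【Introduction】 Put simply, let o be the observation and q the model. Modeling P(o|q) gives a generative model. The basic idea is to first build a probability density model of the samples and then use that model for inference and prediction. This requires the set of known samples to be infinite, or at least as large as possible. The approach is generally grounded in statistical mechanics and Bayesian theory. Modeling the conditional (posterior) probability P(q|o) gives a discriminative model. The basic idea is to build a discriminant function under finite-sample conditions, ignoring how the samples were generated and studying the predictive model directly. The representative theory is statistical learning theory. The two approaches currently overlap a good deal.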
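【Discriminative Model】: inter-class probabilistic description
Also called a conditional model or conditional probability model. It estimates the conditional distribution p(class|context). It uses positive and negative examples together with their class labels and focuses on the margin between classes. The objective function corresponds directly to classification accuracy.
- Main characteristics:
  Looks for the optimal decision surface between classes; reflects the differences between data of different classes.
- Advantages:
  Classification boundaries are more flexible and more sophisticated than those obtained with purely probabilistic methods or generative models.
  Clearly separates the features that distinguish several classes, or one class from all the others.
  Works well for clustering and under viewpoint changes, partial occlusion, and scale variations.
  Suited to recognition with a large number of classes.
  Discriminative models are simpler than generative models and easier to learn.
- Drawbacks:
  Cannot reflect the characteristics of the training data itself. Its power is limited: it can tell you whether the answer is 1 or 2, but it cannot describe the whole scene.
  Lacks the elegance of generative models: priors, structure, uncertainty.
  Alternative notions of penalty functions, regularization, kernel functions.
  Black-box operation: the relationships among variables are unclear and cannot be visualized.
- Common examples:
  logistic regression
  SVMs
  traditional neural networks
  Nearest neighbor
  Conditional random fields (CRF): a recently proposed and currently popular model that originated in NLP and is spreading to ASR and CV.
- Main applications:
  Image and document classification
  Biosequence analysis
  Time series prediction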
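As a small illustration of the discriminative approach, the sketch below fits logistic regression, one of the models listed above, by modeling p(class|x) directly without ever describing how x is distributed. The toy 1-D data, the learning rate, and the epoch count are assumptions chosen only for this example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p(y=1 | x) = sigmoid(w*x + b) by gradient ascent on the mean log-likelihood."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Residuals (label minus predicted probability) drive both gradients.
        residuals = [y - sigmoid(w * x + b) for x, y in zip(xs, ys)]
        w += lr * sum(r * x for r, x in zip(residuals, xs)) / n
        b += lr * sum(residuals) / n
    return w, b


def predict(w, b, x):
    """Return the more probable class under the fitted conditional model."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


# Toy data (made up): class 0 on the left, class 1 on the right.
xs = [-2.0, -1.5, -1.0, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(xs, ys)
print(predict(w, b, -1.0), predict(w, b, 1.0))
```

The fitted parameters describe only where the boundary between the classes lies. Nothing in `w` or `b` says what a typical class-0 or class-1 sample looks like, which is the limitation listed under drawbacks.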
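【Generative Model】: intra-class probabilistic description
Also called a generating model. It estimates the joint probability distribution, p(class, context) = p(class|context) * p(context).
It is used to model randomly generated observations, especially when some hidden parameters are given. In machine learning it is used either to model the data directly (modeling the observed draws with a probability density function) or as an intermediate step toward a conditional probability density function. The conditional distribution can be obtained from a generative model with Bayes' rule.
If the observed data were truly produced by the generative model, the model's parameters can be fitted so as to maximize the likelihood of the data. Data are rarely produced exactly by the generative model, however, so a more accurate approach is to model the conditional density function directly, that is, to use classification or regression analysis.
This differs from a descriptive model, in which all variables are measured directly.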
- Main characteristics:
  Generally models the class-conditional density and the prior (that is, the joint distribution), describing how the data are distributed from a statistical point of view and reflecting the similarity among data of the same class.
  Attends only to each class itself (that is, the probability within the region at the lower left of the points), without caring where the decision boundary lies.
- Advantages:
  Carries richer information than a discriminative model.
  More flexible than a discriminative model for single-class problems.
  The model can be obtained by incremental learning.
  Can be used when data are incomplete (missing data).
  modular construction of composed solutions to complex problems
  prior knowledge can be easily taken into account
  robust to partial occlusion and viewpoint changes
  can tolerate significant intra-class variation of object appearance
- Drawbacks:
  tend to produce a significant number of false positives. This is particularly true for object classes which share a high visual similarity such as horses and cows
  Learning and computation are relatively complex.
- Common examples:
  Gaussians, Naive Bayes, Mixtures of multinomials
  Mixtures of Gaussians, Mixtures of experts, HMMs
  Sigmoidal belief networks, Bayesian networks
  Markov random fields
The generative models listed above can also be trained discriminatively, for example GMMs or HMMs. Training methods include EBW (Extended Baum-Welch) and the Large Margin method recently proposed by Fei Sha.
- Main applications:
  NLP: Traditional rule-based or Boolean logic systems (Dialog and Lexis-Nexis) are giving way to statistical approaches (Markov models and stochastic context grammars)
  Medical Diagnosis: the QMR knowledge base, initially a heuristic expert system for reasoning about diseases and symptoms, has been augmented with a decision-theoretic formulation
  Genomics and Bioinformatics: sequences represented as generative HMMs
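To make the generative recipe concrete, here is a minimal sketch that fits a prior p(class) and a 1-D Gaussian p(x|class) per class, derives the posterior from the joint with Bayes' rule, and also draws new samples from the fitted model. The toy data and all function names are assumptions made for this illustration. The Gaussian is the simplest of the models listed above, not a method taken from the cited sources.

```python
import math
import random


def fit_gaussian_classes(xs, ys):
    """Estimate (prior, mean, variance) of a 1-D Gaussian for each class."""
    params = {}
    for c in sorted(set(ys)):
        pts = [x for x, y in zip(xs, ys) if y == c]
        mean = sum(pts) / len(pts)
        var = sum((x - mean) ** 2 for x in pts) / len(pts)
        params[c] = (len(pts) / len(xs), mean, var)
    return params


def joint_density(params, x, c):
    """p(x, c) = p(c) * N(x; mean_c, var_c)."""
    prior, mean, var = params[c]
    return prior * math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def posterior(params, x):
    """p(c | x) for every class, obtained from the joint by Bayes' rule."""
    joints = {c: joint_density(params, x, c) for c in params}
    total = sum(joints.values())
    return {c: j / total for c, j in joints.items()}


def sample(params, rng):
    """Draw (x, c) from the fitted joint: pick a class from the prior, then x from its Gaussian."""
    classes = list(params)
    c = rng.choices(classes, weights=[params[k][0] for k in classes])[0]
    _, mean, var = params[c]
    return rng.gauss(mean, math.sqrt(var)), c


# Toy data (made up): class 0 clustered near -1.5, class 1 near 1.5.
xs = [-2.0, -1.5, -1.0, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 1, 1, 1]
params = fit_gaussian_classes(xs, ys)
post = posterior(params, 1.2)
print(post, sample(params, random.Random(0)))
```

Unlike a purely discriminative classifier, the fitted `params` describe each class on its own terms, which is what makes sampling and per-class density evaluation possible.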
【Relationship between the two】 A discriminative model can be obtained from a generative model, but a generative model cannot be obtained from a discriminative model. Can performance of SVMs be combined elegantly with flexible Bayesian statistics? Maximum Entropy Discrimination marries both methods: Solve over a distribution of parameters (a distribution over solutions)
【Reference links】 http://prfans.com/forum/viewthread.php?tid=80 http://hi.baidu.com/cat_ng/blog/item/5e59c3cea730270593457e1d.html http://en.wikipedia.org/wiki/Generative_model http://blog.csdn.net/yangleecool/archive/2009/04/05/4051029.aspx
================== Comparing three models: HMMs, MRF, and CRF
http://blog.sina.com.cn/s/blog_4cdaefce010082rm.html
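HMMs (hidden Markov models): the state sequence cannot be observed directly (it is hidden); each observation is regarded as a random function of the state sequence; state transitions are random, with the state changing according to the transition probability matrix. HMMs differ from MRFs in that they contain only label-field variables, not observation-field variables.
MRF (Markov random field): models an image as a grid of random variables. Each variable depends explicitly on a neighborhood made up of random variables other than itself (the Markov property).
CRF (conditional random field), sometimes described as a conditionally trained Markov random field: a conditional probability model for labeling and segmenting sequential data. Formally, a CRF can be viewed as an undirected graphical model that considers the conditional probability of a label sequence given an input sequence.
Applications in vision problems:
HMMs: image denoising, image texture segmentation, blurred image restoration, texture image retrieval, automatic target recognition, etc.
MRF: image restoration, image segmentation, edge detection, texture analysis, object matching and recognition, etc.
CRF: object detection, recognition, object segmentation in image sequences
P.S. The label field is a hidden random field that describes the local correlation among pixels. The model adopted for it should depend on how well the structure and features of the image are understood, which leaves considerable flexibility. Prior models for the spatial label field are mainly non-causal Markov models and causal Markov models.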
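The HMM description above (hidden states, observations as random functions of the state, a transition probability matrix) can be made concrete with the standard forward algorithm, which sums over all hidden state sequences to get the probability of an observation sequence. The two-state model below and its probabilities are hypothetical values chosen for illustration only.

```python
def forward_likelihood(init, trans, emit, observations):
    """Return P(observations) under an HMM using the forward algorithm."""
    states = list(init)
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = {s: init[s] * emit[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {
            s: emit[s][obs] * sum(alpha[r] * trans[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())


# Hypothetical two-state HMM with two observation symbols.
init = {"A": 0.6, "B": 0.4}                      # initial state distribution
trans = {"A": {"A": 0.7, "B": 0.3},              # state transition probabilities
         "B": {"A": 0.4, "B": 0.6}}
emit = {"A": {"u": 0.9, "v": 0.1},               # emission probabilities per state
        "B": {"u": 0.2, "v": 0.8}}

print(forward_likelihood(init, trans, emit, ["u", "v", "u"]))
```

The hidden states never appear in the input to `forward_likelihood`; only the observation symbols do, which mirrors the point that the state sequence is not directly observable.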
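Summer gas safety: remember "three closes and one open"
Source: Hefei Daily. Date: 2011-8-3
Hefei Daily report (Hu Juan, Wang Qian): This reporter learned from Hefei Gas Group that residents recently reported deformed hoses at the connectors of their gas stoves, and Gas Group staff promptly visited to replace the hoses. With the hot weather, the Gas Group reminds residents to pay attention to gas safety in summer: do not walk away while gas is in use, so that liquid boiling over from a pot does not put out the flame and create a hazard. If you discover a dangerous situation, call the Hefei Gas Group Blue Flame hotline promptly.
High summer temperatures easily cause rubber hoses to age and slip off the connector, producing a leak. Gas Group staff advise replacing the hose every two years. The natural gas hose connects the stove to the gas pipe; users must use a dedicated rubber gas hose and must not substitute other hoses.
Keeping the kitchen ventilated is also very important. When running air conditioning in summer, residents tend to shut the windows tightly, which is dangerous if gas leaks. "Residents should open the kitchen window and then close the kitchen door. This keeps the rest of the home sealed while letting air circulate in the kitchen." In addition, after using gas, remember "three closes and one open": close the stove switch, the valve in front of the stove, and the kitchen door, and open the kitchen window. Users should also regularly check that no flammable or explosive items are stacked around gas appliances and pipes.
Summer is also the peak season for renovation. Some users box in the gas riser pipe for the sake of appearance, which prevents ventilation and makes repairs difficult if a problem occurs; users should choose a ventilated or removable enclosure instead. Under the Regulations on the Administration of Urban Gas (《城镇燃气管理条例》), users may not install, modify, or remove indoor piped-gas facilities without authorization, or carry out decoration or renovation that endangers the safety of those facilities.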
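The subspace learning papers I had read so far all used PCA/LDA + NN (nearest neighbor), but the classifier can just as well be an SVM, giving PCA/LDA + SVM; see Fig. 8(d) of Yi Ma's SRC paper. PCA/LDA belongs to the feature extraction step of a pattern recognition system and SVM to the classification step. The two steps are independent and can be combined freely.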
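The independence of the two steps can be sketched as plain function composition: any feature extractor can be paired with any classifier. In the sketch below a fixed projection stands in for a learned PCA/LDA basis, and a nearest-class-mean rule stands in for the second classifier (an SVM is not implemented here). All names and the toy data are assumptions made for illustration.

```python
def make_projection(basis):
    """Feature extraction step: project a vector onto the rows of `basis`.
    A learned PCA/LDA basis would be used here; this one is fixed by hand."""
    def extract(x):
        return [sum(b_i * x_i for b_i, x_i in zip(row, x)) for row in basis]
    return extract


def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def nearest_neighbor(feats, labels):
    """Classification step, option 1: 1-NN in feature space."""
    def classify(f):
        dists = [squared_distance(f, t) for t in feats]
        return labels[dists.index(min(dists))]
    return classify


def nearest_mean(feats, labels):
    """Classification step, option 2: nearest class mean (a stand-in for another classifier)."""
    means = {}
    for c in sorted(set(labels)):
        rows = [f for f, label in zip(feats, labels) if label == c]
        means[c] = [sum(col) / len(rows) for col in zip(*rows)]
    def classify(f):
        return min(means, key=lambda c: squared_distance(f, means[c]))
    return classify


def build_system(extract, make_classifier, xs, labels):
    """Compose any extractor with any classifier factory."""
    classify = make_classifier([extract(x) for x in xs], labels)
    return lambda x: classify(extract(x))


# Toy data (made up): only the first coordinate separates the classes.
xs = [[0, 0, 5], [1, 0, -3], [10, 1, 4], [11, 0, -2]]
labels = ["a", "a", "b", "b"]
extract = make_projection([[1, 0, 0]])
system_nn = build_system(extract, nearest_neighbor, xs, labels)
system_mean = build_system(extract, nearest_mean, xs, labels)
print(system_nn([9, 0, 0]), system_mean([9, 0, 0]))
```

Swapping the classifier means changing one argument to `build_system`; the extraction step is untouched.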
Survey notes: Feiping Nie and Shiming Xiang (20130903). I have read the titles of all their 2009-2013 papers; no further survey is needed. Their latest papers are, respectively, Efficient Image Classification via Multiple Rank Regression and Nonparametric Illumination Correction.
have seen, no need to see again:
Robust Classification via Structured Sparse Representation (CVPR 2011)
Patch alignment
need to see: non-traditional face recognition
Coupled Discriminant Analysis for Heterogeneous Face Recognition
Discriminative Multimanifold Analysis for Face Recognition from a Single Training Sample per Person (TPAMI 2013 Feature article)
A General Iterative Shrinkage and Thresholding Algorithm for (ICML 13, code available)
Similarity Component Analysis
Unsupervised and Semi-Supervised Learning via ℓ1-Norm Graph (Feiping Nie, code available)
Local Structure-based Image Decomposition for Feature Extraction with Applications to Face Recognition (TIP)
Sparse representation classifier steered discriminant projections (TNNLS 2013)
Tumor Classification Based on Non-Negative Matrix (chunhou zheng)
L-2,1-Norm Regularized Discriminative Feature Selection for Unsupervised Learning (IJCAI, code available)
Towards structural sparsity An explicit l2 l0 approach (mainly to see how the Lipschitz auxiliary function is used in this paper; mentioned on slide 28 of Chris Ding's lecture notes "sparseBeijing_Christ Ding")
Manifold Adaptive Experimental Design for Text Categorization, Deng Cai (TKDE 2012, code available)
Sparse concept coding for visual analysis (CVPR 2011, code available)
A novel SVM+NDA (Pattern recognition)
ICDM 2010 L2/L0-norm, including Chris Ding's lecture notes
(2011) R. Jenatton, J.-Y. Audibert and F. Bach. Structured Variable Selection with Sparsity-Inducing Norms. Journal of Machine Learning Research, 12(Oct):2777-2824. (plan to finish reading this paper with libing during the week starting 20120312)
Robust Sparse Coding for Face Recognition (discussed with libing; he said he has understood it completely)
Feature selection:
Linear Discriminant Dimensionality Reduction (ECML 2011)
Generalized Fisher Score for Feature Selection (UAI 2011)
Papers to read when time permits: Extreme Learning Machine for Regression and Multiclass Classification (TSMCB 2012)
gains:
1. Know how to derive formula (13) in SRC (Sparse representation classifier, Yi Ma, TPAMI 2009); know how to derive formula (14) in "Efficient and Robust Feature Selection via Joint L2,1-Norms Minimization (NIPS 2010)". The key points are formulas (16) and (17).
2. Know how to derive formulas (10) to (12) in "R1-PCA Rotational invariant L1-norm principal component analysis for robust subspace factorization (ICML 2006)"
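http://wenku.baidu.com/view/cc9b4308bb68a98271fefa6f.html (Beihang University MATLAB tutorial)
CH 7.6 Function handles
A function handle is a MATLAB data type that stores a function's path, scope, name, and overloading information.
Advantages of using function handles:
1. Makes some commands that operate on functions work more reliably.
2. Makes calling a function as convenient as referencing a variable.
3. Gives access to information about overloaded functions of the same name.
4. Lets functions be called from a wider scope, improving software reuse.
5. Speeds up function calls.
I. Creating and inspecting function handles
1. Creation:
handlef=@fname;
handlef=str2func('fname')
For example: fhandle = @sin; Here sin is MATLAB's built-in sine function, and the output variable fhandle is a handle to it. The handle can then be used to call sin, for example:
fhandle(0)
This statement produces the following output:
ans = 0
In effect, the statement fhandle(0) is equivalent to sin(0).
II. Using function handles
A direct call has the form
[out1,out2,…]=fname(in1,in2,…)
The same computation can also be carried out through a function handle:
[out1,out2,…]= handlef(in1,in2,…)
[out1,out2,…]=feval(handlef,in1,in2,…)

http://www.ilovematlab.cn/thread-23048-1-1.html MATLAB function handle @: what is a function handle
I rarely use function handles myself but run into them often, so here is a summary.
A function handle contains the function's path, name, type, and any overloaded methods.
A function handle has to be created explicitly, whereas ordinary graphics handles are created automatically.
A function handle is created with @ or with the str2func command.
Running the sin function with feval:
feval('sin',pi/2) % see the MATLAB help for feval; no need to worry about how this function is used
ans = 1
So what are the benefits of function handles?
1. Faster execution. According to this post, MATLAB searches all the paths every time a function is called, and set path shows that there are a great many of them, so using a function handle for a function that your program calls often will speed things up.
2. As convenient to use as a variable. For example, if I create a handle to a function in the current directory and then move to another directory, the handle can still be called directly without copying the function file over, because the function handle already contains the path. Suppose I create a handle:
h_fun=str2func('rei');
The functions command can be used to inspect it, and the result does indeed include the path:
functions(h_fun)
ans =
function: 'rei'
type: 'simple'
file: 'G:\program\serial232\rei.m'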
Past conference
Conference | Deadline of Paper Submission | Notification of Acceptance
CVPR 2012 | Nov. 21, 2011 | March 2, 2012
ICML 2012 | February 24, 2012 | April 30, 2012
IJCAI 2011 (every two years) | |
AAAI 2012 | | March 28, 2012
ICPR 2012 | Mar. 31, 2012 | Jun. 15, 2012
ECML 2012 | Abstract deadline: Thu 19 April 2012; Paper deadline: Mon 23 April 2012 | Early author: Mon 28 May 2012; Author notification: Fri 15 June 2012
NIPS 2012 | June 1, 2012 |

Recent conference
Conference | Deadline of Paper Submission | Notification of Acceptance
NIPS 2013 | |
IJCAI 2013 (every two years) | Abstract submission: January 26, 2013 (11:59PM, UTC-12). Paper submission: January 31, 2013 (11:59PM, UTC-12). |
AAAI 2013 (once a year) | December 3, 2012 to January 19, 2013: Authors register on the AAAI web site; January 19, 2013 (11:59 PM PST): Electronic abstracts due; January 22, 2013 (11:59 PM PST): Electronic papers due |
ICML 2013 | |
CVPR 2013 | November 15, 2012 |

Some Website: NIPS 2012: https://cmt.research.microsoft.com/NIPS2012/
------------------------------------https://sites.google.com/site/feipingnie/resource ------------------------------------
focus journals: TPAMI, TNN, TIP, TKDE, TCSVT, TMM, TSMC, TIFS, JMLR, Neural Computation, IJCV, PR, PRL, Neurocomputing
focus conferences: NIPS, ICML, AIStat, CVPR, ICCV, ECCV, IJCAI, AAAI, KDD, SIGIR, ACMMM, ECML, ICDM, SDM, CIKM, ICIP, ICPR, ACCV
conferences
Conf. | deadline
NIPS | June 8, 2007
SIGMM | June 2, 2007
ICDM | June 1, 2007
ACCV | April 27, 2007
ICCV | April 10, 2007
KDD | February 28th, 2007
ICML | February 9, 2007
AAAI | February 6, 2007
IJCNN | January 31, 2007
SIGIR | January 28, 2007
ICIP | January 19, 2007
ICME | January 5, 2007
CVPR | December 3, 2006
(1) CVPR 2011 Longuet-Higgins Prize (awarded to classic papers in the CVPR community):
- Paul A. Viola and Michael J. Jones, "Rapid Object Detection using a Boosted Cascade of Simple Features", CVPR 2001. [cited 5187 times on Google Scholar at the time of writing]
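Code of feature selection: http://featureselection.asu.edu/software.php; Rongxiang Hu uses the "feature addition method" from p. 235 of Prof. Sun Jixiang's pattern recognition textbook and finds that it works rather well. His criterion is classification accuracy. The first feature could be chosen at random, but his approach is to pick the feature that performs best on its own. Score-based methods such as Fisher Score and Laplacian Score should all count as Filter methods, because they compute a score for each feature directly and do not depend on any particular learning method.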
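To show what "computing a score for each feature directly" means, here is a minimal sketch of a Fisher score filter using a common textbook form of the score: between-class scatter of the feature divided by its within-class scatter. The toy data are made up, and this is not the code distributed at the URL above.

```python
def fisher_scores(X, y):
    """Return one Fisher score per feature (column of X).
    score_j = sum_c n_c * (mean_cj - mean_j)^2 / sum_c n_c * var_cj
    No classifier is trained, which is what makes this a filter method."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        overall_mean = sum(col) / len(col)
        between, within = 0.0, 0.0
        for c in classes:
            vals = [v for v, label in zip(col, y) if label == c]
            mean_c = sum(vals) / len(vals)
            var_c = sum((v - mean_c) ** 2 for v in vals) / len(vals)
            between += len(vals) * (mean_c - overall_mean) ** 2
            within += len(vals) * var_c
        # A feature with zero within-class scatter separates perfectly or is constant.
        scores.append(between / within if within > 0 else float("inf"))
    return scores


# Toy data (made up): feature 0 separates the classes, feature 1 does not.
X = [[1.0, 5.0], [1.2, 1.0], [3.0, 4.0], [3.2, 2.0]]
y = [0, 0, 1, 1]
scores = fisher_scores(X, y)
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
print(scores, ranking)
```

A wrapper method such as the feature addition approach described above would instead retrain and evaluate a classifier for each candidate feature set, using classification accuracy as the criterion.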
http://www.postech.ac.kr/~seungjin/submitted.html
Recently Done
- Minje Kim, Jiho Yoo, Kyeongok Kang, and Seungjin Choi (2010), "Nonnegative matrix partial co-factorization for spectral and temporal drum source separation," submitted to IEEE JSTSP, September 30, 2010. ( "major revision in 4 weeks," December 21, 2010 ) ( "revised," February 3, 2011 ) ( "accepted," May 12, 2011 ) (ETRI=1/2, CMEST=1/2, WCU)
- Shounan An, Jiho Yoo, and Seungjin Choi (2010),
"Manifold-respecting discriminant nonnegative matrix factorization," submitted to Pattern Recognition Letters, April 10, 2010. ( "major revision in 4 months," October 13, 2010 ) ( "revised," January 10, 2011 ) ( "accepted," January 20, 2011 ) (MTF=1/4, CMEST=1/4, VIEW=1/4, CoSDEC=1/4, WCU)
- Yongsoo Kim, Taek-Kyun Kim, Yungu Kim, Jiho Yoo, Sung Yong Yoo,
Seungjin Choi, and Daehee Hwang (2010), "Principal network analysis: Identification of subnetworks representing major dynamics using gene expression data," submitted to Bioinformatics, September 8, 2010. ( "major revision in 4 months," October 5, 2010 ) ( "revised," November 12, 2010 ) ( "accepted," December 1, 2010 ) (NCRC=1/4, CORE=1/4, WCU)
- Jong Kyoung Kim and Seungjin Choi (2009),
"Probabilistic models for semi-supervised discriminative motif discovery in DNA sequences," submitted to IEEE/ACM TCBB, July 15, 2009. ( "major revision in 3 months," October 4, 2009 ) ( "revised," October 30, 2009 ) ( "minor revision by March 20, 2010," December 22, 2009 ) ( "revised," January 2, 2010 ) ( "accepted," January 29, 2010 ) (NCRC=1, WCU)
- Jiho Yoo and Seungjin Choi (2008),
"Orthogonal nonnegative matrix tri-factorization for co-clustering: Multiplicative updates on Stiefel manifolds," submitted to Information Processing and Management, September 15, 2008. ( "major revision, " June 18, 2009 ) ( "revised," August 20, 2009 ) ( "accept with minor revisions," December 9, 2009 ) ( "revised," December 21, 2009 ) ( "accepted," December 27, 2009 ) (MTF=1/2, CMEST=1/2, WCU, MSRA)
- Hyekyoung Lee, Jiho Yoo, and Seungjin Choi (2009),
"Semi-supervised nonnegative matrix factorization," submitted to IEEE Signal Processing Letters, April 28, 2009 ( "accept with mandatory minor revisions," June 9, 2009 ) ( "revised," June 13, 2009 ) ( "accepted," June 16, 2009 ) (MTF=1/3, CMEST=1/3, VIEW=1/3, WCU, MSRA)
- Young Min Oh, Jong Kyoung Kim, Yongwook Choi, Seungjin Choi, and Joo-Yeon Yoo (2008),
"Prediction and experimental validation of novel STAT3 target genes in human cancer cells," submitted to PLoS ONE, April 3, 2009. ( "revised," July 4, 2009 ) ( "minor-revised," July 20, 2009 ) ( "accepted," August 4, 2009 ) (NCRC=1, WCU)
- Hyohyeong Kang, Yunjun Nam, and Seungjin Choi (2009),
"Composite common spatial pattern for subject-to-subject transfer," submitted to IEEE Signal Processing Letters, February 20, 2009. ( "accept with mandatory minor revisions," March 18, 2009 ) ( "revised," April 9, 2009 ) ( "accepted," April 11, 2009 ) (NCRC=1, WCU)
- Hyekyoung Lee, Andrzej Cichocki, and Seungjin Choi (2008),
"Kernel nonnegative matrix factorization for spectral EEG feature extraction," submitted to Neurocomputing, June 25, 2008. ( "accept with minor revision," February 7, 2009 ) ( "revised," February 18, 2009 ) ( "accepted," March 8, 2009 ) (SAFE=1/2, NCRC=1/2)
- Hyekyoung Lee and Seungjin Choi (2008),
"Group nonnegative matrix factorization for EEG classification," submitted to AISTATS-2009, November 1, 2008. (Notification of acceptance, January 9, 2009) ( "accepted for poster presentation," January 9, 2009 ) (NCRC=1/2, SAFE=1/2)
- Seunghak Lee and Seungjin Choi (2008),
"Landmark MDS ensemble," submitted to Pattern Recognition, March 30, 2008. ("reconsider, if revised," by September 14 ) ("revised," August 5, 2008 ) ( "reconsider, if revised," by October 24 ) ( "2nd-revised," October 21, 2008 ) ( "accept if revised," by December 20, 2008 ) ( "3rd-revised," November 24, 2008 ) ( "accepted," November 26, 2008 ) (CMEST=1/2, SAFE=1/2)