Extreme Learning Machine: Principles and a Python Implementation (ELM)



I have recently been reading Professor Guang-Bin Huang's original ELM paper, but the code link in the paper is no longer valid. After searching through various channels, I finally found the code Professor Huang wrote at the time. However, it is a Matlab version, which is inconvenient in many ways, so I wrote a Python version myself.

Basic Principle

The Extreme Learning Machine (ELM) is a single-hidden-layer feedforward neural network. Its defining feature is that the weights and biases of the hidden-layer nodes are generated randomly and never adjusted afterwards; the only parameters that need to be determined are the output-layer weights $\boldsymbol\beta$.

Let $\boldsymbol H$ denote the hidden-layer output and $\boldsymbol T$ the output-layer (final) output. The output weights $\boldsymbol\beta$ then satisfy $\boldsymbol H\boldsymbol\beta = \boldsymbol T$, so $\boldsymbol\beta = \boldsymbol H^\dagger\boldsymbol T$, where $\boldsymbol H^\dagger$ is the Moore–Penrose generalized inverse of $\boldsymbol H$.
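As a quick numerical sanity check (a minimal sketch; the matrices below are made-up toy values, not from the paper), the least-squares solution $\boldsymbol\beta = \boldsymbol H^\dagger\boldsymbol T$ can be computed with `numpy.linalg.pinv`:

```python
import numpy as np

# toy hidden-layer output H (4 samples, 3 hidden nodes) and targets T (4 x 2)
rng = np.random.default_rng(0)
H = rng.standard_normal((4, 3))
beta_true = rng.standard_normal((3, 2))
T = H @ beta_true  # targets that are exactly reachable from H

# beta = pinv(H) @ T is the minimum-norm least-squares solution of H @ beta = T
beta = np.linalg.pinv(H) @ T
print(np.allclose(H @ beta, T))  # True: H @ beta reproduces T
```

Because these targets lie exactly in the column space of $\boldsymbol H$, the residual is zero; with real data the pseudoinverse gives the best least-squares fit instead.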

The network structure of an ELM is very simple, as shown below.
(Figure: ELM network structure)
Here $\boldsymbol w$ and $\boldsymbol b$ are the weight matrix and bias vector of the hidden-layer nodes, and $\boldsymbol\beta$ is the output-layer weight matrix.
Suppose we have a set of samples $(\boldsymbol X, \boldsymbol T)$, where $\boldsymbol X$ is the data matrix of dimension $n \times d$ ($n$ is the number of samples, $d$ the number of features), and $\boldsymbol T$ is the label matrix of dimension $n \times t$. If the hidden layer has $m$ nodes, the weight matrix $\boldsymbol w$ has dimension $d \times m$ and the bias vector $\boldsymbol b$ has dimension $1 \times m$.

Denote the hidden-layer output by $\boldsymbol H = g(\boldsymbol X\boldsymbol w + \boldsymbol b)$, where $g(\cdot)$ is the activation function. Careful readers will notice that, strictly speaking, these dimensions cannot be added: it is a matrix plus a vector. True! But NumPy allows it: its broadcasting mechanism adds a vector to every row of a matrix whose trailing dimension matches. This is a small shortcut that also keeps the derivation easy to follow.
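The broadcasting trick can be seen in a tiny example (toy sizes chosen purely for illustration): an $n \times m$ matrix plus a $1 \times m$ row is expanded row-wise by NumPy.

```python
import numpy as np

n, d, m = 5, 3, 4               # samples, features, hidden nodes (toy sizes)
X = np.ones((n, d))
w = np.full((d, m), 0.5)        # input weights, d x m
b = np.arange(m).reshape(1, m)  # bias as a 1 x m row

# X @ w has shape (n, m); adding the (1, m) bias broadcasts it to every row
Z = X @ w + b
print(Z.shape)  # (5, 4)
print(Z[0])     # [1.5 2.5 3.5 4.5] -- each row of X @ w gets b added
```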
Accordingly, the output matrix is $\boldsymbol T = \boldsymbol H\boldsymbol\beta$.

Code Implementation

One small difference between the implementation and the derivation above concerns the hidden-layer bias $\boldsymbol b$: in the derivation it is a vector, but for convenience the code generates it as a $1 \times m$ matrix.
To use it, save the code below as a file named myELM.py.

```python
# -*- coding:utf-8 -*-
# keep learning, build yourself.
# author: Mobius
# 2023/4/14 9:00
import numpy as np
from numpy.linalg import pinv
from sklearn.preprocessing import OneHotEncoder


class ELM:
    def __init__(self, hiddenNodeNum, activationFunc="sigmoid", type_="CLASSIFIER"):
        # output-weight matrix beta
        self.beta = None
        # bias matrix
        self.b = None
        # input-weight matrix
        self.W = None
        # number of hidden-layer nodes
        self.hiddenNodeNum = hiddenNodeNum
        # activation function
        self.activationFunc = self.chooseActivationFunc(activationFunc)
        # ELM type: CLASSIFIER -> classification, REGRESSION -> regression
        self.type_ = type_

    def fit(self, X, T):
        if self.type_ == "REGRESSION":
            try:
                if T.shape[1] > 1:
                    raise ValueError("the output dimension must be 1 for regression")
            except IndexError:
                # if T is a 1-D array, convert it to a column vector
                T = np.array(T).reshape(-1, 1)
        if self.type_ == "CLASSIFIER":
            # one-hot encoder
            encoder = OneHotEncoder()
            # convert T to its one-hot representation
            T = encoder.fit_transform(T.reshape(-1, 1)).toarray()
        # input dimension d, number of samples n
        n, d = X.shape
        # input-weight matrix, d x hiddenNodeNum
        self.W = np.random.uniform(-1.0, 1.0, size=(d, self.hiddenNodeNum))
        # bias matrix, 1 x hiddenNodeNum (broadcast over the n samples)
        self.b = np.random.uniform(-0.4, 0.4, size=(1, self.hiddenNodeNum))
        # hidden-layer output matrix, n x hiddenNodeNum
        H = self.activationFunc(np.dot(X, self.W) + self.b)
        # output weights, hiddenNodeNum x m; beta = ((H.T @ H)^-1) @ H.T @ T
        self.beta = np.dot(np.dot(pinv(np.dot(H.T, H)), H.T), T)

    def chooseActivationFunc(self, activationFunc):
        """Select the activation function; the return value is the function itself."""
        if activationFunc == "sigmoid":
            return self._sigmoid
        elif activationFunc == "sin":
            return self._sine
        elif activationFunc == "cos":
            return self._cos

    def predict(self, x):
        h = self.activationFunc(np.dot(x, self.W) + self.b)
        res = np.dot(h, self.beta)
        if self.type_ == "REGRESSION":
            # regression prediction
            return res
        elif self.type_ == "CLASSIFIER":
            # classification prediction: return the index of the maximum value,
            # since the class label happens to equal that index
            return np.argmax(res, axis=1)

    @staticmethod
    def score(y_true, y_pred):
        # accuracy: fraction of labels that match
        if len(y_pred) != len(y_true):
            raise ValueError("dimensions do not match")
        totalNum = len(y_pred)
        rightNum = np.sum([1 if p == t else 0 for p, t in zip(y_pred, y_true)])
        return rightNum / totalNum

    @staticmethod
    def RMSE(y_pred, y_true):
        # Root Mean Square Error
        # formula reference: https://blog.csdn.net/yql_/article/details/
        try:
            if y_pred.shape[1] == 1:
                y_pred = y_pred.reshape(-1)
        except IndexError:
            pass
        return np.sqrt(np.sum(np.square(y_pred - y_true)) / len(y_pred))

    @staticmethod
    def _sigmoid(x):
        return 1.0 / (1 + np.exp(-x))

    @staticmethod
    def _sine(x):
        return np.sin(x)

    @staticmethod
    def _cos(x):
        return np.cos(x)
```


Test Cases

The ELM implemented above is tested on both a regression task and a classification task.

Regression

```python
# regression test on the Boston housing dataset
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from myELM import ELM

data = pd.read_csv("./housing.csv", delim_whitespace=True, header=None)
# standardize the features
scaler = StandardScaler()
X = data.iloc[:, :-1]
X = scaler.fit_transform(X)
# leave the target values untouched
y = data.iloc[:, -1]
# random train/test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)
elm = ELM(hiddenNodeNum=20, activationFunc="sigmoid", type_="REGRESSION")
elm.fit(X_train, y_train)
y_pred = elm.predict(X_test)
rmse = elm.RMSE(y_pred, y_test)
print("RMSE:", rmse)
```
RMSE: 5.7914
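As a cross-check (a sketch, assuming scikit-learn is available; the values below are made up), the RMSE formula used in myELM agrees with the square root of `sklearn.metrics.mean_squared_error`:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

# RMSE exactly as implemented in myELM
rmse_manual = np.sqrt(np.sum(np.square(y_pred - y_true)) / len(y_pred))
# square root of sklearn's MSE gives the same value
rmse_sklearn = np.sqrt(mean_squared_error(y_true, y_pred))
print(np.isclose(rmse_manual, rmse_sklearn))  # True
```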

Classification

```python
# classification test on the iris dataset
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn import datasets
from myELM import ELM

# load the data
iris = datasets.load_iris()
X = iris.data
y = iris.target
# split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)
elm = ELM(hiddenNodeNum=20, activationFunc="sigmoid", type_="CLASSIFIER")
elm.fit(X_train, y_train)
y_pred = elm.predict(X_test)
print("Test accuracy:", elm.score(y_test, y_pred))
```

Output

Test accuracy: 0.89473

Since the weight and bias matrices are generated randomly on every run, the accuracy varies from run to run.
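To make runs reproducible, NumPy's global seed can be fixed before calling `fit`, since the implementation draws W and b with `np.random.uniform`. A minimal sketch (`random_hidden_params` is a hypothetical helper that mirrors the draws inside `ELM.fit`):

```python
import numpy as np

def random_hidden_params(d, m, seed):
    # hypothetical helper: same draws as inside ELM.fit, with a fixed seed
    np.random.seed(seed)
    W = np.random.uniform(-1.0, 1.0, size=(d, m))
    b = np.random.uniform(-0.4, 0.4, size=(1, m))
    return W, b

# with the same seed, the random hidden-layer parameters are identical
W1, b1 = random_hidden_params(13, 20, seed=42)
W2, b2 = random_hidden_params(13, 20, seed=42)
print(np.array_equal(W1, W2) and np.array_equal(b1, b2))  # True
```

In the same way, calling `np.random.seed(42)` immediately before `elm.fit(...)` pins down the accuracy reported above.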

The Boston housing dataset used above was downloaded from the Kaggle website.

Appendix
Professor Huang's original Matlab code:

```matlab
REGRESSION=0;
CLASSIFIER=1;

%%%%%%%%%%% Load training dataset
train_data=load(TrainingData_File);
T=train_data(:,1)';
P=train_data(:,2:size(train_data,2))';
clear train_data;                       % Release raw training data array

%%%%%%%%%%% Load testing dataset
test_data=load(TestingData_File);
TV.T=test_data(:,1)';
TV.P=test_data(:,2:size(test_data,2))';
clear test_data;                        % Release raw testing data array

NumberofTrainingData=size(P,2);
NumberofTestingData=size(TV.P,2);
NumberofInputNeurons=size(P,1);

if Elm_Type~=REGRESSION
    %%%%%%%%%%%% Preprocessing the data of classification
    sorted_target=sort(cat(2,T,TV.T),2);
    label=zeros(1,1);                   % Find and save in 'label' class label from training and testing data sets
    label(1,1)=sorted_target(1,1);
    j=1;
    for i = 2:(NumberofTrainingData+NumberofTestingData)
        if sorted_target(1,i) ~= label(1,j)
            j=j+1;
            label(1,j) = sorted_target(1,i);
        end
    end
    number_class=j;
    NumberofOutputNeurons=number_class;

    %%%%%%%%%% Processing the targets of training
    temp_T=zeros(NumberofOutputNeurons, NumberofTrainingData);
    for i = 1:NumberofTrainingData
        for j = 1:number_class
            if label(1,j) == T(1,i)
                break;
            end
        end
        temp_T(j,i)=1;
    end
    T=temp_T*2-1;

    %%%%%%%%%% Processing the targets of testing
    temp_TV_T=zeros(NumberofOutputNeurons, NumberofTestingData);
    for i = 1:NumberofTestingData
        for j = 1:number_class
            if label(1,j) == TV.T(1,i)
                break;
            end
        end
        temp_TV_T(j,i)=1;
    end
    TV.T=temp_TV_T*2-1;
end

start_time_train=cputime;

%%%%%%%%%%% Random generate input weights InputWeight (w_i) and biases BiasofHiddenNeurons (b_i) of hidden neurons
InputWeight=rand(NumberofHiddenNeurons,NumberofInputNeurons)*2-1;
BiasofHiddenNeurons=rand(NumberofHiddenNeurons,1);
tempH=InputWeight*P;
clear P;                                % Release input of training data
ind=ones(1,NumberofTrainingData);
BiasMatrix=BiasofHiddenNeurons(:,ind);  % Extend the bias matrix BiasofHiddenNeurons to match the dimension of H
tempH=tempH+BiasMatrix;

%%%%%%%%%%% Calculate hidden neuron output matrix H
switch lower(ActivationFunction)
    case {'sig','sigmoid'}
        %%%%%%%% Sigmoid
        H = 1 ./ (1 + exp(-tempH));
    case {'sin','sine'}
        %%%%%%%% Sine
        H = sin(tempH);
    case {'hardlim'}
        %%%%%%%% Hard Limit
        H = double(hardlim(tempH));
    case {'tribas'}
        %%%%%%%% Triangular basis function
        H = tribas(tempH);
    case {'radbas'}
        %%%%%%%% Radial basis function
        H = radbas(tempH);
        %%%%%%%% More activation functions can be added here
end
clear tempH;                            % Release the temporary array for calculation of hidden neuron output matrix H

%%%%%%%%%%% Calculate output weights OutputWeight (beta_i)
OutputWeight=pinv(H') * T';             % implementation without regularization factor //refer to 2006 Neurocomputing paper
end_time_train=cputime;
TrainingTime=end_time_train-start_time_train; % Calculate CPU time (seconds) spent for training ELM

%%%%%%%%%%% Calculate the training accuracy
Y=(H' * OutputWeight)';                 % Y: the actual output of the training data
if Elm_Type == REGRESSION
    TrainingAccuracy=sqrt(mse(T - Y))   % Calculate training accuracy (RMSE) for regression case
end
clear H;

%%%%%%%%%%% Calculate the output of testing input
start_time_test=cputime;
tempH_test=InputWeight*TV.P;
clear TV.P;                             % Release input of testing data
ind=ones(1,NumberofTestingData);
BiasMatrix=BiasofHiddenNeurons(:,ind);
tempH_test=tempH_test + BiasMatrix;
switch lower(ActivationFunction)
    case {'sig','sigmoid'}
        %%%%%%%% Sigmoid
        H_test = 1 ./ (1 + exp(-tempH_test));
    case {'sin','sine'}
        %%%%%%%% Sine
        H_test = sin(tempH_test);
    case {'hardlim'}
        %%%%%%%% Hard Limit
        H_test = hardlim(tempH_test);
    case {'tribas'}
        %%%%%%%% Triangular basis function
        H_test = tribas(tempH_test);
    case {'radbas'}
        %%%%%%%% Radial basis function
        H_test = radbas(tempH_test);
        %%%%%%%% More activation functions can be added here
end
TY=(H_test' * OutputWeight)';           % TY: the actual output of the testing data
end_time_test=cputime;
TestingTime=end_time_test-start_time_test;

if Elm_Type == REGRESSION
    TestingAccuracy=sqrt(mse(TV.T - TY)) % Calculate testing accuracy (RMSE) for regression case
end

if Elm_Type == CLASSIFIER
    %%%%%%%%%% Calculate training & testing classification accuracy
    MissClassificationRate_Training=0;
    MissClassificationRate_Testing=0;
    for i = 1 : size(T, 2)
        [x, label_index_expected]=max(T(:,i));
        [x, label_index_actual]=max(Y(:,i));
        if label_index_actual~=label_index_expected
            MissClassificationRate_Training=MissClassificationRate_Training+1;
        end
    end
    TrainingAccuracy=1-MissClassificationRate_Training/size(T,2);
    for i = 1 : size(TV.T, 2)
        [x, label_index_expected]=max(TV.T(:,i));
        [x, label_index_actual]=max(TY(:,i));
        if label_index_actual~=label_index_expected
            MissClassificationRate_Testing=MissClassificationRate_Testing+1;
        end
    end
    TestingAccuracy=1-MissClassificationRate_Testing/size(TV.T,2);
end
```