Companion Python library for the machine learning book Feature Engineering & Selection for Explainable Models: A Second Course for Data Scientists. It performs baseline correction and provides the following three methods for removing the baseline from spectra.
- ModPoly: modified multi-polynomial fit [1]. It takes the following three parameters.
  - `degree`: the polynomial degree. Default is 2.
  - `repitition`: the maximum number of iterations to run. Default is 100.
  - `gradient`: the gradient for the polynomial loss. Default is 0.001. It measures the incremental gain over each iteration. If the gain in any iteration is less than this value, iteration stops.
- IModPoly: improved ModPoly [2], which addresses the noise issue in ModPoly. It takes the following three parameters.
  - `degree`: the polynomial degree. Default is 2.
  - `repitition`: the maximum number of iterations to run. Default is 100.
  - `gradient`: the gradient for the polynomial loss. Default is 0.001. It measures the incremental gain over each iteration. If the gain in any iteration is less than this value, iteration stops.
- ZhangFit: Zhang fit [3], which requires no user intervention or prior information such as detected peaks. It takes the following three parameters.
  - `lambda_`: a smoothing parameter that the user can adjust. The larger `lambda_` is, the smoother the resulting background. Default is 100.
  - `porder`: the order of the difference penalty used in the adaptive iteratively reweighted penalized least squares baseline fit. Default is 1.
  - `repitition`: the number of iterations to run. Default is 15.
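To illustrate how `degree`, `repitition`, and `gradient` interact, here is a minimal pure-Python sketch of a ModPoly-style iteration. It is not the library's implementation. The helper names `fit_polynomial` and `modpoly_sketch`, the rescaling of the x-axis to [-1, 1], and the exact form of the stopping criterion are assumptions made for illustration. The idea is to fit a polynomial, clip the working spectrum down to the fit so that peaks pull the baseline up less on the next pass, and stop once the relative change falls below `gradient`.

```python
def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit via the normal equations; returns fitted values."""
    n = degree + 1
    # Normal equations (A^T A) c = A^T y, where A[i][j] = x[i] ** j.
    ata = [[sum(xi ** (r + c) for xi in x) for c in range(n)] for r in range(n)]
    aty = [sum((xi ** r) * yi for xi, yi in zip(x, y)) for r in range(n)]
    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for row in range(col + 1, n):
            factor = ata[row][col] / ata[col][col]
            for k in range(col, n):
                ata[row][k] -= factor * ata[col][k]
            aty[row] -= factor * aty[col]
    # Back substitution for the polynomial coefficients.
    coeffs = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(ata[row][k] * coeffs[k] for k in range(row + 1, n))
        coeffs[row] = (aty[row] - tail) / ata[row][row]
    return [sum(c * xi ** j for j, c in enumerate(coeffs)) for xi in x]


def modpoly_sketch(spectrum, degree=2, repitition=100, gradient=0.001):
    """Illustrative ModPoly-style baseline removal (assumed form, not the library code)."""
    n = len(spectrum)
    # Assumption: rescale the x-axis to [-1, 1] to keep the fit well conditioned.
    x = [2.0 * i / (n - 1) - 1.0 for i in range(n)]
    work = [float(v) for v in spectrum]
    baseline = work
    for _ in range(repitition):
        baseline = fit_polynomial(x, work, degree)
        # Clip peaks: keep the lower of the working spectrum and the fit.
        clipped = [min(w, b) for w, b in zip(work, baseline)]
        total = sum(abs(w) for w in work) or 1.0
        change = sum(abs(c - w) for c, w in zip(clipped, work)) / total
        work = clipped
        if change < gradient:  # incremental gain too small, stop iterating
            break
    # Baseline-subtracted spectrum.
    return [s - b for s, b in zip(spectrum, baseline)]


corrected = modpoly_sketch([10, 20, 1.5, 5, 2, 9, 99, 25, 47], degree=2)
print(corrected)
```

Because this sketch uses its own stopping rule and axis scaling, its output will generally differ from the values returned by the library's `ModPoly` method.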
We can use the Python library to process spectral data with any of the three techniques (ModPoly, IModPoly, or ZhangFit) for baseline subtraction. Each function returns the baseline-subtracted spectrum.
from BaselineRemoval import BaselineRemoval

input_array = [10, 20, 1.5, 5, 2, 9, 99, 25, 47]
polynomial_degree = 2  # only needed for the ModPoly and IModPoly algorithms

# Wrap the spectrum once, then apply each baseline-removal method to it
baseObj = BaselineRemoval(input_array)
Modpoly_output = baseObj.ModPoly(polynomial_degree)
Imodpoly_output = baseObj.IModPoly(polynomial_degree)
Zhangfit_output = baseObj.ZhangFit()

print('Original input:', input_array)
print('Modpoly base corrected values:', Modpoly_output)
print('IModPoly base corrected values:', Imodpoly_output)
print('ZhangFit base corrected values:', Zhangfit_output)
Original input: [10, 20, 1.5, 5, 2, 9, 99, 25, 47]
Modpoly base corrected values: [-1.98455800e-04 1.61793368e+01 1.08455179e+00 5.21544654e+00
7.20210508e-02 2.15427531e+00 8.44622093e+01 -4.17691125e-03
8.75511661e+00]
IModPoly base corrected values: [-0.84912125 15.13786196 -0.11351367 3.89675187 -1.33134142 0.70220645
82.99739548 -1.44577432 7.37269705]
ZhangFit base corrected values: [ 8.49924691e+00 1.84994576e+01 -3.31739230e-04 3.49854060e+00
4.97412948e-01 7.49628529e+00 9.74951576e+01 2.34940300e+01
4.54929023e+01]
Install the library with pip:
pip install BaselineRemoval
Md Azimul Haque (2022). Feature Engineering & Selection for Explainable Models: A Second Course for Data Scientists. Lulu Press, Inc.
[1] Automated Method for Subtraction of Fluorescence from Biological Raman Spectra by Lieber & Mahadevan-Jansen (2003)
[2] Automated Autofluorescence Background Subtraction Algorithm for Biomedical Raman Spectroscopy by Jianhua Zhao, Harvey Lui, David I. McLean, and Haishan Zeng (2007)
[3] Baseline correction using adaptive iteratively reweighted penalized least squares by Zhi-Min Zhang, Shan Chen, and Yi-Zeng Liang (2010)