-
- Description:
Python module to ease pattern classification analyses of large datasets. It provides high-level abstraction of typical processing steps (e.g. data preparation, classification, feature selection, generalization testing), a number of implementations of some popular algorithms (e.g. kNN, Ridge Regressions, Sparse Multinomial Logistic Regression, GPR, RFE, I-RELIEF), and bindings to external ML libraries (libsvm, shogun, R). While it is not limited to neuroimaging data (e.g. fMRI), it is eminently suited for such datasets.
It is an actively developed project, so you might be better off trying it from the version control system, which contains the ongoing 0.5 development. See http://dev.pymvpa.org for the upcoming website and documentation. Please see the documentation on how to obtain the sources and "build" from them.
- Changes to previous version:
- 2.0.0 (Mon, Dec 19 2011)
  This release aggregates all the changes that occurred between official releases in the 0.4 series and various snapshot releases (in the 0.5 and 0.6 series). To get a better overview of high-level changes see the :ref:`release notes for 0.5 <chap_release_notes_0.5>` and :ref:`0.6 <chap_release_notes_0.6>` as well as the summaries of release candidates below.
  Fixes (23 BF commits)
  - significance level in the right tail was fixed to include the value tested -- otherwise it resulted in optimistic bias (or absurdly high significance in the improbable case of all estimates having the same value)
  - compatible with the upcoming IPython 0.12 and renamed sklearn (Fixes #57)
  - do not double-train `slave` classifiers while assessing sensitivities (Fixes #53)
  Enhancements (30 ENH + 3 NF commits)
  - resolving voting ties in kNN based on mean distance, and randomly in SMLR
  - :class:`kNN`'s `ca.estimates` now contains dictionaries with votes for each class
  - consistent zscoring in :class:`Hyperalignment`
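The right-tail fix listed for 2.0.0 can be illustrated with a small sketch. This is not PyMVPA's implementation: `right_tail_p` is a hypothetical helper that computes an empirical p-value from a list of null-distribution estimates, and the `inclusive` flag exists only to contrast the two counting rules.

```python
def right_tail_p(null_estimates, value, inclusive=True):
    """Empirical right-tail p-value of `value` under `null_estimates`.

    Hypothetical helper for illustration only, not a PyMVPA function.
    """
    if inclusive:
        # Count null estimates at least as large as the tested value.
        hits = sum(1 for x in null_estimates if x >= value)
    else:
        # A strict comparison ignores ties and is therefore optimistic.
        hits = sum(1 for x in null_estimates if x > value)
    return hits / float(len(null_estimates))


# Degenerate case: every null estimate equals the tested value.
null = [0.5] * 100
p_strict = right_tail_p(null, 0.5, inclusive=False)
p_inclusive = right_tail_p(null, 0.5, inclusive=True)
print(p_strict, p_inclusive)
```

With the strict comparison the degenerate case reports a p-value of zero, which is the "absurdly high significance" the changelog entry refers to; the inclusive count avoids it.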
- 2.0.0~rc5 (Wed, Oct 19 2011)
  Major: to allow easy co-existence of stable PyMVPA 0.4.x and 0.6 development, the `mvpa` module was renamed into :mod:`mvpa2`.
  Fixes
  - compatible with the new Shogun 1.x series
  - compatible with the new h5py 2.x series
  - mvpa-prep-fmri -- various compatibility fixes and smoke testing
  - deepcopying :class:`SummaryStatistics` during add
  Enhancements
  - tutorial uses :mod:`mvpa2.tutorial_suite` now
  - better suppression of R warnings when needed
  - internal attributes of many classes were exposed as properties
  - more unification of `__repr__` for many classes
- 0.6.0~rc4 (Wed, Jun 14 2011)
  Fixes
  - Finished transition to :mod:`nibabel` conventions in `plot_lightbox`
  - Addressed :mod:`matplotlib.hist` API change
  - Various adjustments in the tests batteries (:mod:`nibabel` 1.1.0 compatibility, etc)
  New functionality
  - Explicit new argument `flatten` to from_wizard -- default behavior changed if mapper was provided as well
  Enhancements
  - Elaborated `__str__` and `__repr__` for some Classifiers and Measures
- 0.6.0~rc3 (Thu, Apr 12 2011)
  Fixes
  - Bugfixes regarding the interaction of FlattenMapper and BoxcarMapper that affected event-related analyses.
  - `Splitter` now handles attribute value `None` for splitting properly.
  - `GNBSearchlight` handling of `roi_ids`.
  - More robust detection of :mod:`scikits.learn` and :mod:`nipy` externals.
  New functionality
  - Added a `Repeater` node to yield a dataset multiple times and `Sifter` node to exclude some datasets. Consequently, the "nosplitting" mode of `Splitter` got removed at the same time.
  - :file:`tools/niils` -- little tool to list details (dimensionality, scaling, etc) of the files in nibabel-supported formats.
  Enhancements
  - Numerous documentation fixes.
  - Various improvements and increased flexibility of null distribution estimation of Measures.
  - All attributes are now reported in sorted order when printing a dataset.
  - `fmri_dataset` now also stores the input image type.
  - `CrossValidation` can now take a custom `Splitter` instance. Moreover, the default splitter of CrossValidation is more robust in terms of number and type of created splits for common usage patterns (i.e. together with partitioners).
  - `CrossValidation` takes any custom Node as `errorfx` argument.
  - `ConfusionMatrix` can now be used as an `errorfx` in CrossValidation.
  - `LOE(ACC): Linear Order Effect in ACC` was added to `ConfusionMatrix` to detect trends in performances across splits.
  - A `Node`'s postproc is now accessible as a property.
  - `RepeatedMeasure` has a new 'concat_as' argument that allows results to be concatenated along the feature axis. The default behavior, stacking as multiple samples, is unchanged.
  - `Searchlight` now has the ability to mark the center/seed of an ROI with a feature attribute in the generated datasets.
  - `debug` takes `args` parameter for delayed string comprehensions. It should reduce run-time impact of `debug()` calls in regular, non `-O` mode of Python operation.
  - String summaries and representations (provided by `__str__` and `__repr__`) were made more exhaustive and more coherent. Additional properties to access initial constructor arguments were added to a variety of classes.
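The idea behind the `args` parameter of `debug` can be sketched as follows. The `debug` function, the `ACTIVE_TARGETS` registry and the `Expensive` class below are hypothetical stand-ins written for this illustration; they are not PyMVPA's actual API. The point is only that formatting happens after the target check, so a disabled target never pays for building the string.

```python
ACTIVE_TARGETS = set()  # hypothetical registry of enabled debug targets


class Expensive(object):
    """Object whose string conversion is costly; counts conversions."""
    calls = 0

    def __str__(self):
        Expensive.calls += 1
        return "expensive summary"


def debug(target, msg, args=None):
    """Print `msg % args` only if `target` is enabled (illustrative only)."""
    if target not in ACTIVE_TARGETS:
        return None
    # Formatting is deferred until we know the message will be shown.
    text = msg % args if args is not None else msg
    print("[%s] %s" % (target, text))
    return text


obj = Expensive()
debug("CLF", "state: %s", (obj,))          # target disabled: nothing formatted
calls_when_disabled = Expensive.calls

ACTIVE_TARGETS.add("CLF")
shown = debug("CLF", "state: %s", (obj,))  # target enabled: formatted once
calls_when_enabled = Expensive.calls
```

Passing the message and its arguments separately, rather than a pre-built string, is what keeps the disabled path cheap.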
  Internal changes
  - New debug target `STDOUT` to allow attaching metrics (e.g. traceback, timestamps) to regular output printed to stdout
  - New set of decorators to help with unittests
    - `@nodebug` to disable specific debug targets for the duration of the test.
    - `@reseed_rng` to guarantee consistent random data given initial seeding.
    - `@with_tempfile` to provide a tempfile name which would get removed upon completion (test success or failure)
  - Dropping daily testing of `maint/0.5` branch -- RIP.
  - `Collection` s were provided with adequate `(deep|)copy`. And `Dataset` was refactored to use the `copy` method of `Collection` s.
  - `update-*` Makefile rules automatically should fast-forward corresponding `website-updates` branch
  - `MVPA_TESTS_VERBOSITY` controls also :mod:`numpy` warnings now.
  - `Dataset.__array__` provides original array instead of copy (unless dtype is provided)
  Also adapts changes from 0.4.6 and 0.4.7 (see corresponding changelogs).
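The test decorators listed under "Internal changes" for 0.6.0~rc3 might look roughly like the sketch below. These are simplified re-implementations on top of the standard library, written under the assumption that reseeding means calling `random.seed` and that the temporary file name is passed as the first argument; PyMVPA's own decorators may differ in signature and behavior.

```python
import functools
import os
import random
import tempfile


def reseed_rng(seed=0):
    """Reseed the stdlib RNG before each call (illustrative only)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            random.seed(seed)
            return func(*args, **kwargs)
        return wrapper
    return decorator


def with_tempfile(func):
    """Pass a temporary file name to `func` and remove the file afterwards."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        fd, name = tempfile.mkstemp()
        os.close(fd)
        try:
            return func(name, *args, **kwargs)
        finally:
            # Runs on success and on failure alike.
            if os.path.exists(name):
                os.remove(name)
    return wrapper


@reseed_rng(seed=42)
def draw():
    return [random.random() for _ in range(3)]


@with_tempfile
def write_and_report(filename):
    with open(filename, "w") as f:
        f.write("data")
    return filename


first, second = draw(), draw()
used_name = write_and_report()
```

Here `first` and `second` are identical because the generator is reseeded before each call, and the file named by `used_name` no longer exists once the decorated function has returned.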
- 0.6.0~rc2 (Thu, Mar 3 2011)
  Various fixes in the mvpa.atlas module.
- 0.6.0~rc1 (Thu, Feb 24 2011)
  Many, many, many
  For an overview of the most drastic changes :ref:`see constantly evolving release notes for 0.6 <chap_release_notes_0.6>`
- 0.5.0 (sometime in March 2010)
  This is a special release, because it has never seen the general public. A summary of fundamental changes introduced in this development version can be seen in the :ref:`release notes <chap_release_notes_0.5>`.
  Most notably, this version was the first to come with a comprehensive two-day workshop/tutorial.
- 0.4.7 (Tue, Mar 07 2011) (Total: 12 commits)
  A bugfix release
  Fixed
  - Addressed the issue with input NIfTI files having `scl_` fields set: it could result in incorrect analyses and map2nifti-produced NIfTI files. Now input files account for scaling/offset if `scl_` fields direct to do so. Moreover upon map2nifti, those fields get reset.
  - :file:`doc/examples/searchlight_minimal.py` - best error is the minimal one
  Enhancements
  - :class:`~mvpa.clfs.gnb.GNB` can now tolerate training datasets with a single label
  - :class:`~mvpa.clfs.meta.TreeClassifier` can have trailing nodes with no classifier assigned
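As background for the `scl_` fix in 0.4.7: the NIfTI-1 header defines `scl_slope` and `scl_inter`, and stored values are meant to be read as `raw * scl_slope + scl_inter` whenever the slope is nonzero. The sketch below shows that arithmetic on a plain list; `apply_scaling` is a hypothetical helper and is not how PyMVPA or nibabel implement it.

```python
def apply_scaling(raw_values, scl_slope, scl_inter):
    """Return values scaled as raw * scl_slope + scl_inter.

    Per the NIfTI-1 convention, a slope of 0 means "no scaling".
    Hypothetical helper for illustration only.
    """
    if scl_slope == 0:
        return list(raw_values)
    return [v * scl_slope + scl_inter for v in raw_values]


raw = [0, 10, 20]
scaled = apply_scaling(raw, scl_slope=2.0, scl_inter=1.0)
unscaled = apply_scaling(raw, scl_slope=0.0, scl_inter=5.0)
```

Ignoring these fields when they are set means analysing the stored integers rather than the intended values, which is the kind of incorrect analysis the changelog entry describes.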
- 0.4.6 (Tue, Feb 01 2011) (Total: 20 commits)
  A bugfix release
  Fixed (few BF commits):
  - Compatibility with numpy 1.5.1 (histogram) and scipy 0.8.0 (workaround for a regression in legendre)
  - Compatibility with libsvm 3.0
  - :class:`~mvpa.clfs.plr.PLR` robustification
  Enhancements
  - Enforce suppression of numpy warnings while running unittests. Also setting verbosity >= 3 enables all warnings (Python, NumPy, and PyMVPA)
  - :file:`doc/examples/nested_cv.py` example (adopted from 0.5)
  - Introduced base class :class:`~mvpa.clfs.base.LearnerError` for classifiers' exceptions (adopted from 0.5)
  - Adjusted example data to live up to nibabel's warranty of NIfTI standard-compliance
  - More robust operation of MC iterations -- skip iterations where the classifier experienced difficulties and raised an exception (e.g. due to degenerate data)
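The more robust Monte-Carlo behavior described for 0.4.6 can be sketched as a loop that skips failing iterations instead of aborting. The `LearnerError` class here is a local stand-in that only borrows the name mentioned above, and `noisy_estimate` is a hypothetical measure that fails on purpose for about a fifth of its draws; none of this is PyMVPA code.

```python
import random


class LearnerError(Exception):
    """Local stand-in for a classifier failure (illustrative only)."""


def noisy_estimate(rng):
    """Hypothetical measure that sometimes fails on degenerate data."""
    value = rng.random()
    if value < 0.2:
        raise LearnerError("degenerate permutation")
    return value


def monte_carlo(n_iterations, seed=0):
    """Collect estimates, counting rather than propagating failures."""
    rng = random.Random(seed)
    estimates, skipped = [], 0
    for _ in range(n_iterations):
        try:
            estimates.append(noisy_estimate(rng))
        except LearnerError:
            skipped += 1  # skip this iteration instead of aborting the run
    return estimates, skipped


estimates, skipped = monte_carlo(200)
```

Keeping a count of skipped iterations makes it possible to report how many samples the resulting null distribution is actually based on.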
- Supported Operating Systems: Agnostic
- Data Formats: None
- Tags: Shogun, Python, Eeg, Classification, Regression, Support Vector Machines, K Nearest Neighbor Classification, Pca, Rfe, Neuroscience, Fmri, Framework, Gpr, Lars, Smlr, Meg
Other available revisions
-
Version: 2.0.0
Changelog: identical to the 2.0.0 entry under "Changes to previous version" above.
Date: December 22, 2011, 01:36:32
Version: 0.4.5
Changelog: 0.4.5 (Fri, Oct 01 2010) (Total: 27 commits)
A bugfix release
- Fixed (13 BF commits):
o Compatible with LIBSVM >= 2.91 (Closes: #583018)
o No string exceptions raised (Python 2.6 compatibility)
o Setting of shrinking parameter in sg interface
o Deducing number of SVs for SVR (LIBSVM)
o Correction of significance in the tails of non-parametric tests
- Miscellaneous:
o Development repository moved to http://github.com/PyMVPA/PyMVPA
Date: October 2, 2010, 16:51:22
Version: 0.4.4
Changelog: 0.4.4 (Mon, Feb 2 2010) (Total: 144 commits)
Primarily a bugfix release, probably the last in the 0.4 series since development for the 0.5 release is leaping forward.
- New functionality (19 NF commits):
o GNB implements Gaussian Naïve Bayes Classifier.
o read_fsl_design() to read FSL FEAT design.fsf files (Contributed by Russell A. Poldrack).
o SequenceStats to provide basic statistics on labels sequence (counter-balancing, autocorrelation).
o New exceptions DegenerateInputError and FailedToTrainError to be thrown by classifiers primarily during training/testing.
o Debug target STATMC to report on progress of Monte-Carlo sampling (during permutation testing).
- Refactored (15 RF commits):
o To get users prepared for the 0.5 release, internally and in some examples/documentation, access to states and parameters is done via corresponding collections, not from the top level object (e.g. clf.states.predictions instead of soon-to-be-deprecated clf.predictions). That should also lead to improved performance.
o Adopted copy.py from python2.6 (support Ellipsis as well).
- Fixed (38 BF commits):
o GLM output does not depend on the enabled states any more.
o Variety of docstrings fixed and/or improved.
o Do not derive NaN scaling for SVM’s C whenever data is degenerate (led to never finishing SVM training).
o sg:
+ KRR is optional now – avoids crashing if KRR is not available.
+ tolerance to absent set_precompute_matrix in svmlight in recent shogun versions.
+ support for recent (present in 0.9.1) API change in exposing debug levels.
o Python 2.4 compatibility issues: kNN and IFS
Date: February 7, 2010, 16:48:00
Version: 0.4.3
Changelog: Online documentation editor is no longer available due to low demand – please submit changes via email.
Performance (Contributed by Valentin Haenel) (3 OPT commits):
- Further optimized LIBSVM bindings.
- Copy-if-sorted in selectFeatures.
New functionality (25 NF commits):
- ProcrusteanMapper with orthogonal and oblique transformations.
- Ability to generate simple reports using reportlab. See/run examples/match_distribution.py for example.
- TreeClassifier – construct simple hierarchies of classifiers.
- wtf() to report information about the system/PyMVPA to be included in the bug reports.
- Parameter ‘reverse’ to swap training/testing splits in Splitter.
- Example code for the analysis of event-related dataset using ERNiftiDataset.
- toEvents() to create lists of Event.
- mvpa-prep-fmri was extended with plotting of motion correction parameters.
- ColumnData can be explicitly told whether the file contains a header.
- In XMLBasedAtlas (e.g. fsl atlases) it is now possible to provide custom ‘image_file’ to get maps or indexes for the areas given an atlas’s volume registered into subject space.
- Updated included LIBSVM version to 2.89 and provided support for its “silencing”.
Refactored (27 RF commits):
- Dataset’s copy() with deep=False allows for shallow copying the dataset.
- FeatureSelectionClassifiers in warehouse no longer reuse the same classifiers, but use clones.
Fixed (70 BF commits):
- OneWayAnova: previously degrees of freedom were not considered while computing F-scores.
- Majority voting strategy in kNN: it was not working.
- Various fixes to ensure cross-platform building (numpy header locations, etc).
- Stability fixes in ConfusionMatrix.
- idsonboundaries(): samples at the end of the sequence were not handled properly.
- Proper “untraining” of the classifiers of FeatureSelectionClassifiers which use sensitivities: it could lead to various unpleasant side-effects if the same slave classifier was used simultaneously by multiple MetaClassifiers (like TreeClassifier).
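To make the OneWayAnova entry in the 0.4.3 fixes concrete, here is a minimal sketch of a one-way ANOVA F-score in which both sums of squares are divided by their degrees of freedom. `oneway_f` is a hypothetical function written for this illustration, not PyMVPA's OneWayAnova.

```python
def oneway_f(groups):
    """F-score for a list of groups, each a list of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / float(n_total)
    means = [sum(g) / float(len(g)) for g in groups]
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    # Each sum of squares is normalized by its degrees of freedom.
    df_between = k - 1
    df_within = n_total - k
    return (ss_between / df_between) / (ss_within / df_within)


f_score = oneway_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
```

For this toy input the between-group sum of squares is 26 with 2 degrees of freedom and the within-group sum of squares is 6 with 6 degrees of freedom, so the F-score is 13.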
Date: September 8, 2009, 20:21:12
Version: 0.4.1
Changelog: Initial Announcement on mloss.org.
Date: May 18, 2008, 17:06:05
Comments
-
- Yaroslav Halchenko (on May 18, 2008, 17:07:37)
- It is an actively developed project at the moment, so it is preferable not to rely on releases but rather to use the master branch of the git repository mentioned on the project homepage
-
- Yaroslav Halchenko (on September 8, 2009, 20:21:46)
- 0.4.3 release update
-
- Yaroslav Halchenko (on September 8, 2009, 20:29:35)
- updated entry so that it is not treated as PRE