.. _usersGuide_23_advancedCorpus:

.. WARNING: DO NOT EDIT THIS FILE:
   AUTOMATICALLY GENERATED.
   PLEASE EDIT THE .py FILE DIRECTLY.



User's Guide, Chapter 23: Advanced Corpus and Metadata Searching
================================================================

.. code:: python

    from music21 import *

Creating multiple corpus repositories via local corpora
-------------------------------------------------------

In addition to the default local corpus, music21 allows users to create
and save as many named local corpora as they like, which will persist
from session to session.

Let's create a new *local* corpus, give it a directory to find music
files in, and then save it:

.. code:: python

    aNewLocalCorpus = corpus.corpora.LocalCorpus('A new corpus')
    aNewLocalCorpus.existsInSettings




.. parsed-literal::
   :class: ipython-result

    False



.. code:: python

    aNewLocalCorpus.addPath('~/Desktop')
    aNewLocalCorpus.directoryPaths


.. parsed-literal::
   :class: ipython-result

    ('/Users/josiah/Desktop',)


.. code:: python

    aNewLocalCorpus.save()
    aNewLocalCorpus.existsInSettings




.. parsed-literal::
   :class: ipython-result

    True



We can see that our new *local* corpus is saved by checking for the
names of all saved *local* corpora:

.. code:: python

    corpus.corpora.LocalCorpus.listLocalCorporaNames()


.. parsed-literal::
   :class: ipython-result

    [None, 'trecento', 'A new corpus', 'bach', 'fake']


..  note::

    When running ``listLocalCorporaNames()``, you will see ``None``,
    which indicates the default *local* corpus, along with the names of any
    non-default *local* corpora you've created. In the above example, a
    number of other corpora had already been created.

Finally, we can delete the *local* corpus we previously created like
this:

.. code:: python

    aNewLocalCorpus.delete()
    aNewLocalCorpus.existsInSettings




.. parsed-literal::
   :class: ipython-result

    False



Inspecting metadata bundle search results
-----------------------------------------

Let's take a closer look at some search results:

.. code:: python

    bachBundle = corpus.corpora.CoreCorpus().search('bach', 'composer')
    bachBundle[0]




.. parsed-literal::
   :class: ipython-result

    <music21.metadata.bundles.MetadataEntry: bach_choraleAnalyses_riemenschneider001_rntxt>



.. code:: python

    bachBundle[0].sourcePath




.. parsed-literal::
   :class: ipython-result

    'bach/choraleAnalyses/riemenschneider001.rntxt'



.. code:: python

    bachBundle[0].metadataPayload




.. parsed-literal::
   :class: ipython-result

    <music21.metadata.RichMetadata at 0x111cb97f0>



.. code:: python

    mdpl = bachBundle[0].metadataPayload
    mdpl.noteCount




.. parsed-literal::
   :class: ipython-result

    60



.. code:: python

    bachAnalysis0 = bachBundle[0].parse()
    bachAnalysis0.show()




.. image:: usersGuide_23_advancedCorpus_17_0.png



Manipulating multiple metadata bundles
--------------------------------------

Another useful feature of ``music21``'s metadata bundles is that they
can be operated on as though they were sets, allowing you to union,
intersect and difference multiple metadata bundles, thereby creating
more complex search results:

.. code:: python

    corelliBundle = corpus.search('corelli', field='composer')
    corelliBundle




.. parsed-literal::
   :class: ipython-result

    <music21.metadata.bundles.MetadataBundle {1 entry}>



.. code:: python

    bachBundle.union(corelliBundle)




.. parsed-literal::
   :class: ipython-result

    <music21.metadata.bundles.MetadataBundle {22 entries}>

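To see what these set-style operations mean for search results, here is a
minimal sketch that models each bundle as a plain Python ``set`` of source
paths. This is only an analogy, not how ``MetadataBundle`` is implemented,
and all of the file names are hypothetical placeholders:

.. code:: python

    # Stand-in for two search results: each "bundle" is a set of source paths.
    # The file names below are hypothetical placeholders.
    bach_paths = {'bach/bwv1.6.mxl', 'bach/bwv10.7.mxl', 'shared/anthology.mxl'}
    corelli_paths = {'corelli/opus3no1.mxl', 'shared/anthology.mxl'}

    either = bach_paths | corelli_paths      # union: found by either search
    both = bach_paths & corelli_paths        # intersection: found by both
    bach_only = bach_paths - corelli_paths   # difference: Bach minus Corelli

    print(len(either), sorted(both), sorted(bach_only))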


Consult the API for :class:`~music21.metadata.bundles.MetadataBundle`
for a more in-depth look at how this works.

Getting a metadata bundle
-------------------------

In music21, metadata is information *about* a score, such as its
composer, title, initial key signature or ambitus. A metadata *bundle*
is a collection of metadata pulled from an arbitrarily large group of
different scores. Users can search through metadata bundles to find
scores with certain qualities, such as all scores in a given corpus with
a time signature of ``6/8``, or all scores composed by Monteverdi.

There are a number of different ways to acquire a metadata bundle. For
the core corpus, simply downloading music21 is enough: we include a
pre-made metadata bundle (in ``corpus/metadataCache/core.json``), so you
never need to build one yourself unless you're contributing to the
project. You may, however, want to create metadata bundles for your own
local corpora. Access the ``metadataBundle`` attribute of any ``Corpus``
instance to get its corresponding metadata bundle:

.. code:: python

    coreCorpus = corpus.corpora.CoreCorpus()
    coreCorpus.metadataBundle




.. parsed-literal::
   :class: ipython-result

    <music21.metadata.bundles.MetadataBundle 'core': {14483 entries}>

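As a rough mental model of what searching a bundle involves, the sketch
below treats a bundle as a list of small records and filters them on one
field. The ``Entry`` class, the ``search`` function, the case-insensitive
substring matching and the file names are all assumptions made for
illustration; they are not music21's API or its matching rules:

.. code:: python

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Entry:
        """Hypothetical stand-in for one bundle entry."""
        source_path: str
        composer: str
        time_signature: str

    def search(entries, query, field):
        """Return entries whose chosen field contains query, ignoring case."""
        query = query.lower()
        return [e for e in entries if query in getattr(e, field).lower()]

    toy_bundle = [
        Entry('monteverdi/madrigal_a.mxl', 'Claudio Monteverdi', '4/4'),
        Entry('bach/chorale_b.mxl', 'J. S. Bach', '6/8'),
        Entry('bach/chorale_c.mxl', 'J. S. Bach', '4/4'),
    ]

    by_composer = search(toy_bundle, 'monteverdi', 'composer')
    in_six_eight = search(toy_bundle, '6/8', 'time_signature')
    print([e.source_path for e in by_composer])
    print([e.source_path for e in in_six_eight])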


Music21 also provides a handful of convenience methods for getting
metadata bundles associated with the *virtual*, *local* or *core*
corpora:

.. code:: python

    coreBundle = metadata.bundles.MetadataBundle.fromCoreCorpus()
    localBundle = metadata.bundles.MetadataBundle.fromLocalCorpus()
    otherLocalBundle = metadata.bundles.MetadataBundle.fromLocalCorpus('blah')
    virtualBundle = metadata.bundles.MetadataBundle.fromVirtualCorpus()

We strongly recommend using the above ``from*()`` methods. Some of these
metadata bundles can become quite large, and methods like
``fromCoreCorpus()`` will cache the metadata bundle in memory once it
has been read from disk, potentially saving you a lot of time.

Advanced users can also create metadata bundles manually, by passing in
either the name of the corpus the bundle should refer to or,
equivalently, a ``Corpus`` instance:

.. code:: python

    coreBundle = metadata.bundles.MetadataBundle('core')
    coreBundle = metadata.bundles.MetadataBundle(corpus.corpora.CoreCorpus())

However, you'll need to read the bundle's saved data from disk before
you can do anything useful with the bundle. Bundles don't read their
associated JSON files automatically when they're manually instantiated.

.. code:: python

    coreBundle




.. parsed-literal::
   :class: ipython-result

    <music21.metadata.bundles.MetadataBundle 'core': {0 entries}>



.. code:: python

    coreBundle.read()




.. parsed-literal::
   :class: ipython-result

    <music21.metadata.bundles.MetadataBundle 'core': {14483 entries}>

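The difference between a manually created bundle, which stays empty until
``read()`` is called, and a cached helper that reads from disk only once can
be pictured with a small standard-library sketch. ``ToyBundle`` and
``cached_bundle`` are hypothetical names, and the use of
``functools.lru_cache`` is an assumption of this sketch rather than a
description of how music21 caches bundles:

.. code:: python

    import functools
    import json
    import os
    import tempfile

    class ToyBundle:
        """Hypothetical bundle: starts empty and only loads on read()."""
        def __init__(self, json_path):
            self.json_path = json_path
            self.entries = {}

        def read(self):
            with open(self.json_path, encoding='utf-8') as handle:
                self.entries = json.load(handle)
            return self

    @functools.lru_cache(maxsize=None)
    def cached_bundle(json_path):
        """Read the file once per path; later calls return the same object."""
        return ToyBundle(json_path).read()

    # Write a tiny cache file so the sketch is self-contained.
    cache_path = os.path.join(tempfile.mkdtemp(), 'toy.json')
    with open(cache_path, 'w', encoding='utf-8') as handle:
        json.dump({'a.mxl': {'composer': 'Anon'}}, handle)

    manual = ToyBundle(cache_path)
    print(len(manual.entries))          # still empty: nothing read yet
    print(len(manual.read().entries))   # populated after read()
    print(cached_bundle(cache_path) is cached_bundle(cache_path))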


Creating persistent metadata bundles
------------------------------------

Metadata bundles can take a long time to create. So it'd be nice if they
could be written to and read from disk. Unfortunately we never got
around to...nah, just kidding. Of course you can. Just call ``.write()``
on one:

.. code:: python

    coreBundle = metadata.bundles.MetadataBundle('core')
    coreBundle.read()




.. parsed-literal::
   :class: ipython-result

    <music21.metadata.bundles.MetadataBundle 'core': {14483 entries}>



.. code:: python

    coreBundle.write()

They can also be completely rebuilt, as you will want to do for local
corpora. To add information to a bundle, use the ``addFromPaths()``
method:

.. code:: python

    newBundle = metadata.bundles.MetadataBundle()
    paths = corpus.corpora.CoreCorpus().getBachChorales()
    failedPaths = newBundle.addFromPaths(paths)
    failedPaths




.. parsed-literal::
   :class: ipython-result

    []

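The return value is worth checking: as the variable name above suggests,
it lists the paths that could not be processed. The pattern of collecting
failures instead of stopping at the first one can be sketched as follows.
``toy_extract`` and ``add_from_paths`` are hypothetical helpers written for
this illustration, not music21 functions:

.. code:: python

    def toy_extract(path):
        """Hypothetical extractor: rejects anything that is not .mxl."""
        if not path.endswith('.mxl'):
            raise ValueError(f'cannot parse {path}')
        return {'sourcePath': path}

    def add_from_paths(bundle, paths):
        """Add one entry per readable path; return the paths that failed."""
        failed = []
        for path in paths:
            try:
                bundle[path] = toy_extract(path)
            except ValueError:
                failed.append(path)
        return failed

    toy_bundle = {}
    failed_paths = add_from_paths(toy_bundle, ['a.mxl', 'notes.txt', 'b.mxl'])
    print(failed_paths, len(toy_bundle))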


Then inspect the bundle, and call ``.write()`` when you are ready to save
it to disk:

.. code:: python

    newBundle


.. parsed-literal::
   :class: ipython-result

    <music21.metadata.bundles.MetadataBundle {402 entries}>

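Conceptually, writing a bundle means serializing its entries to a JSON
file that a later session can read back. The round trip below uses only
the standard library; the entry contents and the file name are made up,
and the layout of music21's real cache files is not shown here:

.. code:: python

    import json
    import os
    import tempfile

    # Hypothetical bundle contents, keyed by source path.
    entries = {'bach/chorale_a.mxl': {'composer': 'J. S. Bach'}}

    # Write the entries to a JSON file...
    cache_path = os.path.join(tempfile.mkdtemp(), 'toy_bundle.json')
    with open(cache_path, 'w', encoding='utf-8') as handle:
        json.dump(entries, handle)

    # ...and read them back, as a later session would.
    with open(cache_path, encoding='utf-8') as handle:
        restored = json.load(handle)

    print(restored == entries)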

..  note::

    Building metadata can be very slow. For example, building the *core*
    metadata bundle can easily take as long as four hours, even though the
    build process uses multiple cores. Be patient when building metadata
    bundles from large corpora. To monitor the build's progress, set
    ``'debug'`` to ``True`` in your user settings:

    >>> environment.UserSettings()['debug'] = True

You can delete, rebuild and save a metadata bundle in one go with the
``rebuild()`` method:

.. code:: python

    virtualBundle = metadata.bundles.MetadataBundle.fromVirtualCorpus()
    virtualBundle.rebuild()

Rebuilding saves the file as it goes (for safety), so there is no need
to call ``.write()`` afterward.

To delete a metadata bundle's cached-to-disk JSON file, use the
``delete()`` method:

.. code:: python

    virtualBundle.delete()

Deleting a metadata bundle's JSON file won't empty the in-memory
contents of that bundle. For that, use ``clear()``:

.. code:: python

    virtualBundle.clear()
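
The distinction between on-disk and in-memory state can be made concrete
with a small stand-in class. ``ToyBundle`` is hypothetical and only mirrors
the behavior described above: ``delete()`` removes the cache file,
``clear()`` empties the entries held in memory, and neither does the
other's job:

.. code:: python

    import json
    import os
    import tempfile

    class ToyBundle:
        """Hypothetical bundle with separate on-disk and in-memory state."""
        def __init__(self, json_path):
            self.json_path = json_path
            self.entries = {}

        def write(self):
            with open(self.json_path, 'w', encoding='utf-8') as handle:
                json.dump(self.entries, handle)

        def delete(self):
            # Removes only the cache file; self.entries is untouched.
            if os.path.exists(self.json_path):
                os.remove(self.json_path)

        def clear(self):
            # Empties only the in-memory entries.
            self.entries = {}

    bundle = ToyBundle(os.path.join(tempfile.mkdtemp(), 'toy.json'))
    bundle.entries = {'a.mxl': {'composer': 'Anon'}}
    bundle.write()

    bundle.delete()
    after_delete = (os.path.exists(bundle.json_path), len(bundle.entries))
    bundle.clear()
    after_clear = len(bundle.entries)
    print(after_delete, after_clear)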