Churro: Simple Filesystem Object Persistence¶
Churro is a simplistic persistent storage for Python objects which stores a tree of hierchically nested objects as folders and flat files in a filesystem. Churro uses AcidFS to provide ACID transaction semantics. Changes to any Churro object tree are only persisted when a transaction successfully commits. Churro uses JSON to serialize objects. Churro is meant to be lightweight and durable. Use of JSON, a universally understood and human readable text file format, insures that data stored by Churro is portable to other applications and platforms across space and time.
In addition to these docs, it couldn’t hurt to look over the AcidFS documentation.
Defining Persistent Types¶
In order for an object to be saved in a Churro repository, it must inherit
from Persistent
or
PersistentFolder
. Attributes of your
persistent objects that you want to be persisted must be derived from
PersistentProperty
. Probably the best way to illustrate is by
example, so let’s say you’re writing an application that saves contacts in an
address book. We might write some code that looks like this:
from churro import Persistent
from churro import PersistentProperty
from churro import PersistentFolder
class AddressBook(PersistentFolder):
title = PersistentProperty()
def __init__(self, title):
self.title = title
class Contact(Persistent):
name = PersistentProperty()
address = PersistentProperty()
def __init__(self, name, address):
self.name = name
self.address = address
You can see that defining your persistent types is pretty straightforward. Next you’ll want to open a repository and start storing some data.
Adding Objects to the Repository¶
from churro import Churro
repo = Churro('/path/to/folder')
root = repo.root()
contacts = AddressBook('My Contacts')
root['contacts'] = contacts
contacts['fred'] = Contact('Fred Flintstone', '1 Rocky Road')
contacts['barney'] = Contact('Barney Rubble', '6 Bronto Lane')
Above, we create an instance of Churro
where the
argument is the folder in the filesystem where the repository will live. If the
folder does not exist, it will be created and an empty repository will be
initialized. Otherwise an existing repository will be opened. The call to
repo.root()
gets the root folder of the repository,
the starting point for traversing to any other objects in the repository. From
there, adding data to the repository is as easy as instantiating data objects
using folders as Python dicts.
Committing a Transaction¶
So far no data has actually been stored yet. You’ll need to commit a transaction:
import transaction
transaction.commit()
Note
If you’re using Pyramid, you should avoid committing the transaction yourself and use pyramid_tm. For other WSGI frameworks there is also repoze.tm2.
Persistent Properties¶
PersistentProperty
and its subclasses are responsible for
serializing individual attributes of your Python objects to JSON.
PersistentProperty
can handle values of any type natively
serializable to JSON. These include strings, booleans, numbers, lists, and
dictionaries. Persistent properties can also hold as values other
Persistent
objects, allowing objects to be nested inside of
each other.
Two additional property types, PersistentDate
and
PersistentDatetime
are included for storing datetime.date and
datetime.datetime objects respectively.
For other types you’ll need to provide a means for converting the type to
something serializable by JSON and then converting back to a Python object.
This is done by extending PersistentProperty
and overriding
the to_json()
,
from_json()
, and
validate()
methods. The following is an
actual example from Churro code that illustrates this:
import datetime
class PersistentDate(PersistentProperty):
def from_json(self, value):
if value:
return datetime.date(*map(int, value.split('-')))
return value
def to_json(self, value):
if value:
return '%s-%s-%s' % (value.year, value.month, value.day)
return value
def validate(self, value):
if value is not None and not isinstance(value, datetime.date):
raise ValueError("%s is not an instance of datetime.date")
return value
You can use the new property type in your class definitions:
class Contact(Persistent):
name = PersistentProperty()
address = PersistentProperty()
birthday = PersistentDate()
def __init__(self, name, address):
self.name = name
self.address = address
Mutable Property Values¶
Churro automatically keeps track of which objects have been mutated and saves those objects at transaction commit time. Churro does this by keeping track of when a setter is called on a property and marking that object as dirty. So simply assigning a value to a property will cause that object to get persisted at commit time:
daniela.birthday = datetime.date(2010, 5, 12)
You can find yourself in a situation, however, where the assigned value is a mutable structure and instead of assigning a new value to the property you simply mutate the structure. Let’s say that we add a list of friends to our Contact class:
class Contact(Persistent):
name = PersistentProperty()
address = PersistentProperty()
birthday = PersistentDate()
friends = PersistentProperty()
def __init__(self, name, address):
self.name = name
self.address = address
self.friends = []
If we have a Contact instance that is clean and the only change we make is to add a friend to the list, Churro will not detect the mutation and the change will not be persisted at commit time:
# This change won't be persisted
daniela.friends.append('Katy')
One way to get around this problem is to call the
set_dirty()
method on the object that needs to be
saved:
# Unless you call this method
daniela.set_dirty()
This brute force method is always available, whatever you’re doing. Churro
does, however, provide helpers for the two most common types of mutable data,
dicts and lists. These are PersistentDict
and
PersistentList
respectively. We could rewrite the example
above to a use a PersistentList
instead of a plain Python list:
from churro import PersistentList
class Contact(Persistent):
name = PersistentProperty()
address = PersistentProperty()
birthday = PersistentDate()
friends = PersistentProperty()
def __init__(self, name, address):
self.name = name
self.address = address
self.friends = PersistentList()
Now you don’t need to call set_dirty()
when adding a
friend to a contact’s friend list:
# Don't need to call set_dirty, this change will be persisted
daniela.friends.append('Silas')
API Reference¶
-
class
churro.
Churro
(repo, head='HEAD', factory=None, create=True, bare=False)¶ Constructor Arguments
repo
The path to the repository in the real, local filesystem.head
The name of a branch to use as the head for this transaction. Changes made using this instance will be merged to the given head. The default, if omitted, is to use the repository’s current head.factory
A callable that returns the root database object to be stored as the root when creating a new database. The default factory returns an instance of churro.PersistentFolder. This has no effect if the repository has already been created.create
If there is not a Git repository in the indicated directory, should one be created? The default is True.bare
If the Git repository is to be created, create it as a bare repository. If the repository is already created or create is False, this argument has no effect.-
flush
()¶ Writes any unsaved data to the underlying AcidFS filesystem without committing the transaction.
-
root
()¶ Gets the root folder of the repository. This is the starting point for traversing to other objects in the repository.
-
-
class
churro.
Persistent
¶ This is the base class from which all persistent classes for Churro must be derived. Only objects which are instances of a class derived from Persistent may be stored in a Churro repository.
-
deactivate
()¶ Calling this method on a persistent object detaches that object, and its children, from the in memory persistent object tree, potentially allowing it to be garbage collected if there are no other references to the object.
-
set_dirty
()¶ Calling this method alerts Churro that this object is dirty and should be persisted at commit time. It is usually not necesary to call this method from application code, since Churro tries to detect object mutation whenever possible. You may need to call this method from your application code, however, if you use mutable data structures that are not themselves Persistent as values of persistent properties, as Churro has no way of detecting mutations to those structures.
-
-
class
churro.
PersistentFolder
¶ Classes which derive from this class are not only persistent in Churro but have dict-like properties allowing them to contain children which are other persistent objects or folders. Storing an instance of PersistentFolder in a Churro repository, creates a folder in the underlying filesystem, in which child objects are stored. Instances of PersistentFolder are dict-like and are interacted with in the same way as standard Python dictionaries.
-
get
(name, default=None)¶ Returns the child object of the given name. Returns default if the child is not found.
-
items
()¶ Returns an iterator over (child object’s name, child object) tuples.
-
keys
()¶ Returns the names of child objects.
-
remove
(name)¶ Removes the child with the given name from the folder. Raises KeyError if there is no child with the given name.
-
values
()¶ Returns an iterator over child objects.
-
-
class
churro.
PersistentDict
(*args)¶ A PersistentDict is a Python dict work alike that marks its parent object as dirty whenever it is mutated, solving the problem of using mutable datastructures as values for persistent properties with Churro and eliminating the need to call
set_dirty()
in application code when updating the dictionary.
-
class
churro.
PersistentList
(*args)¶ A PersistentList is a Python dict work alike that marks its parent object as dirty whenever it is mutated, solving the problem of using mutable datastructures as values for persistent properties with Churro and eliminating the need to call
set_dirty()
in application code when updating the list.
-
class
churro.
PersistentProperty
¶ The base type for all persistent properties. This property type can handle any data type as a value that is serializable natively to JSON. Other types are implemented by extending this class and overriding the from_json, to_json, and validate methods.
-
from_json
(value)¶ Converts a value from its JSON representation to a Python object.
-
to_json
(value)¶ Converts a value from a Python object to an object that can be serialized as JSON.
-
validate
(value)¶ Used at assignment time to validate a value. If a value is not of the proper type and cannot be converted to the proper type, a ValueError is raised, otherwise the valua is returned, including any transformation or coercion that has been performed.
-
-
class
churro.
PersistentDate
¶ A persistent attribute type that can store instances of datetime.date.
-
class
churro.
PersistentDatetime
¶ A persistent attribute type that can store instances of datetime.datetime.