pydiffx.dom.objects#

The DiffX Object Model.

This is a set of classes that comprise the DiffX Object Model. These consist of container and content section classes, each with type-safe properties used to manage content and options for the DiffX file.

The only object that should be created manually is DiffX. The others are created automatically or when calling DiffX.add_change() or DiffXChangeSection.add_file().

Classes

BaseDiffXContainerSection([parent_section])

Base class for container sections.

BaseDiffXContentSection(**kwargs)

Base class for content sections.

BaseDiffXSection([parent_section])

Base class for a DiffX section.

DiffX([parent_section])

Representation of a DiffX file.

DiffXChangeSection([parent_section])

A change section within a DiffX file.

DiffXFileDiffSection(**kwargs)

A diff content section.

DiffXFileSection([parent_section])

A file section within a change section.

DiffXMetaSection(**kwargs)

A metadata section.

DiffXPreambleSection(**kwargs)

A preamble section.

class pydiffx.dom.objects.BaseDiffXSection(parent_section=None, **attrs)#

Bases: object

Base class for a DiffX section.

This manages option storage and controls the initialization process for the subclass.

options#

The options set for this section. This can be manipulated directly without any type checking, but it’s recommended that consumers go through the dedicated class-level attributes.

Type

dict

section_id#

The ID of this section. This corresponds to a value in Section.

Type

unicode

section_name = None#

The name of the section.

This must be provided by subclasses.

Type:

unicode

default_options = {}#

Default options to set for the section.

These will be written to options when constructing the section if not otherwise provided by the caller.

Type:

dict
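As a rough illustration of how class-level defaults can be folded into per-instance options, the sketch below uses a hypothetical stand-in class (SketchSection, not part of pydiffx). It only assumes what is described above: defaults are written to options at construction time unless the caller supplies a value.

```python
class SketchSection(object):
    """Hypothetical stand-in for a section; not pydiffx's implementation."""

    # Class-level defaults, analogous to default_options.
    default_options = {'format': 'json'}

    def __init__(self, **options):
        # Start from a copy so the class-level dict is never mutated.
        self.options = dict(self.default_options)

        # Caller-provided values take precedence over the defaults.
        self.options.update(options)


section = SketchSection(encoding='utf-8')
overridden = SketchSection(format='custom')

print(section.options)
print(overridden.options)
```

Copying the class-level dictionary before updating it keeps one instance's options from leaking into another's.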

__init__(parent_section=None, **attrs)#

Initialize the section.

Parameters
  • parent_section (BaseDiffXContainerSection, optional) – The parent container section.

  • **attrs (dict) –

    Attributes to set for the section.

    This may consist of attributes representing options or content subsections.

    Any invalid option will raise a DiffXUnknownOptionError.

__eq__(other)#

Return whether this section is equal to another section.

Parameters

other (BaseDiffXSection) – The section to compare to.

Returns

True if the two sections are equal. False if they are not.

Return type

bool
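The combination of a value-based __eq__() and __hash__ = None means sections compare by content but cannot be used as dictionary keys or set members. The sketch below shows that behavior with a hypothetical stand-in class; the comparison logic is an assumption made for illustration, not pydiffx's actual implementation.

```python
class SketchSection(object):
    """Hypothetical stand-in; not pydiffx's implementation."""

    def __init__(self, **options):
        self.options = options

    def __eq__(self, other):
        # Assumed comparison: same class and the same options.
        return (type(self) is type(other) and
                self.options == other.options)

    # Mutable objects that compare by value should not be hashable.
    __hash__ = None


a = SketchSection(encoding='utf-8')
b = SketchSection(encoding='utf-8')

print(a == b)

try:
    hash(a)
    hashable = True
except TypeError:
    # hash() raises TypeError when __hash__ is None.
    hashable = False

print(hashable)
```

On Python 3, defining __eq__() without __hash__() already sets __hash__ to None implicitly; spelling it out gives the same behavior on Python 2.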

__repr__()#

Return a string representation of this section.

Returns

The string representation.

Return type

unicode

__hash__ = None#

class pydiffx.dom.objects.BaseDiffXContainerSection(parent_section=None, **attrs)#

Bases: pydiffx.dom.objects.BaseDiffXSection

Base class for container sections.

Container sections contain additional container and/or content sections. They’re also responsible for setting options on the content sections.

Subclasses must explicitly set subsections.

subsections#

The list of subsections in this section.

Type

list of BaseDiffXSection

__eq__(other)#

Return whether this section is equal to another section.

Parameters

other (BaseDiffXSection) – The section to compare to.

Returns

True if the two sections are equal. False if they are not.

Return type

bool

__iter__()#

Iterate through the immediate subsections.

Yields

BaseDiffXSection – A subsection of this section.
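Iteration covers only the immediate subsections, so nested sections need their own loop. The following sketch uses a hypothetical stand-in container (SketchContainer, not part of pydiffx) to show what that looks like in practice.

```python
class SketchContainer(object):
    """Hypothetical stand-in for a container section."""

    def __init__(self, name, subsections=()):
        self.name = name
        self.subsections = list(subsections)

    def __iter__(self):
        # Yield only the direct children, never grandchildren.
        for subsection in self.subsections:
            yield subsection


change = SketchContainer('change', [
    SketchContainer('file'),
    SketchContainer('file'),
])
root = SketchContainer('diffx', [change])

# Iterating the root reaches the change, not the files beneath it.
immediate = [section.name for section in root]

# Reaching the files takes a second level of iteration.
nested = [child.name for section in root for child in section]

print(immediate)
print(nested)
```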

__hash__ = None#

class pydiffx.dom.objects.BaseDiffXContentSection(**kwargs)#

Bases: pydiffx.dom.objects.BaseDiffXSection

Base class for content sections.

Content sections contain data in some form, indicated by data_type.

They cannot have subsections of their own.

Consumers will generally not need to access content sections directly. Instead, they’ll set data or options through the parent container class’s type-safe attributes, or during construction of the parent section.

data_type = None#

The type of data allowed for this section.

Type:

type

default_value = None#

Default value for the section.

Type:

object

__init__(**kwargs)#

Initialize the section.

Parameters

**kwargs (dict) – Keyword arguments to pass to the parent class. See BaseDiffXSection.__init__() for details.

property content#

The content of this section.

The type will be that of data_type.
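One way to picture a content value tied to data_type is a property whose setter checks the type. The sketch below is an assumption-laden illustration on Python 3: the stand-in classes, the setter, and the choice of TypeError are all hypothetical and are not taken from pydiffx.

```python
class SketchContentSection(object):
    """Hypothetical stand-in for a content section."""

    data_type = None
    default_value = None

    def __init__(self):
        self._content = self.default_value

    @property
    def content(self):
        return self._content

    @content.setter
    def content(self, value):
        # Assumed behavior: reject values that are not of data_type.
        if not isinstance(value, self.data_type):
            raise TypeError('content must be %s, not %s'
                            % (self.data_type.__name__,
                               type(value).__name__))

        self._content = value


class SketchDiffSection(SketchContentSection):
    """Stand-in for a section holding raw diff bytes."""

    data_type = bytes
    default_value = b''


section = SketchDiffSection()
section.content = b'--- a/file\n+++ b/file\n'

try:
    section.content = 'text, not bytes'
    rejected = False
except TypeError:
    rejected = True

print(rejected)
```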

__eq__(other)#

Return whether this section is equal to another section.

Parameters

other (BaseDiffXSection) – The section to compare to.

Returns

True if the two sections are equal. False if they are not.

Return type

bool

__hash__ = None#

class pydiffx.dom.objects.DiffX(parent_section=None, **attrs)#

Bases: pydiffx.dom.properties.ContainerOptionsMixin, pydiffx.dom.properties.MetaOptionsMixin, pydiffx.dom.properties.PreambleOptionsMixin, pydiffx.dom.objects.BaseDiffXContainerSection

Representation of a DiffX file.

This represents a DiffX file as a hierarchical series of objects and attributes. It can be used to construct a new DiffX file piece-by-piece before writing it out to a file or stream, or to read in an existing DiffX file for processing or manipulation.

Consumers will start by working directly with a DiffX instance.

When constructing one, they’ll need to add at least one change by using add_change(), and at least one file to that change.

When reading one, they can read the preamble, metadata, or list of files using the provided attributes.

encoding#

The default encoding for all preamble and metadata sections in the DiffX file. This may be None, in which case an encoding cannot be assumed.

Changing this will not affect the in-memory representation of any data, but it will affect how it’s written.

Type

unicode

meta#

Global metadata for the entire DiffX file.

Type

dict

meta_encoding#

Encoding used when reading/writing the metadata in the file.

See DiffXMetaSection.encoding.

Type

unicode

meta_section#

The actual metadata section. This will generally not be accessed directly.

Type

DiffXMetaSection

preamble#

The preamble content describing the entire series of changes in the DiffX file.

Type

unicode

preamble_encoding#

Encoding used when reading/writing the preamble content in the file.

See DiffXPreambleSection.encoding.

Type

unicode

preamble_indent#

Indentation applied to each line of preamble content.

See DiffXPreambleSection.indent.

Type

int

preamble_line_endings#

The type of line endings used in the preamble content.

See DiffXPreambleSection.line_endings.

Type

unicode

preamble_mimetype#

The mimetype representing the format of the preamble content.

See DiffXPreambleSection.mimetype.

Type

unicode

preamble_section#

The actual preamble section. This will generally not be accessed directly.

Type

DiffXPreambleSection

version#

The version of the DiffX file.

Only supported versions can be set.

Type:

unicode

classmethod from_bytes(data)#

Construct an instance from a DiffX file stored in a byte string.

Parameters

data (bytes) – The DiffX file contents to parse.

Returns

The resulting DiffX instance.

Return type

DiffX

Raises

pydiffx.errors.DiffXParseError – The DiffX contents could not be parsed. Details will be in the error message.

classmethod from_stream(stream)#

Construct an instance from a DiffX file read from a stream.

This will close the stream after it’s been read.

Parameters

stream (file or io.IOBase) – The stream to read from.

Returns

The resulting DiffX instance.

Return type

DiffX

Raises

pydiffx.errors.DiffXParseError – The DiffX contents could not be parsed. Details will be in the error message.

property subsections#

A list of the preamble, meta, and change subsections.

Type:

list of BaseDiffXSection

add_change(**attrs)#

Add a new change section.

Parameters

**attrs (dict) – Attributes to set on the section. This may consist of any attributes listed on DiffXChangeSection.

Returns

The newly-added change section.

Return type

DiffXChangeSection

Raises

pydiffx.errors.DiffXUnknownOptionError – One or more attribute names are invalid.

generate_stats()#

Generate statistics for the DiffX metadata.

This will gather statistics on the number of changes, files, insertions, deletions, and total lines changed.

This should only be run once the diff is complete, before writing it.

to_bytes()#

Write and return the DiffX file contents.

Returns

The DiffX file contents.

Return type

bytes

Raises

pydiffx.errors.BaseDiffXError – There was an error generating the content.

class pydiffx.dom.objects.DiffXChangeSection(parent_section=None, **attrs)#

Bases: pydiffx.dom.properties.ContainerOptionsMixin, pydiffx.dom.properties.MetaOptionsMixin, pydiffx.dom.properties.PreambleOptionsMixin, pydiffx.dom.objects.BaseDiffXContainerSection

A change section within a DiffX file.

A change represents a set of changes to files, possibly backed by a commit.

Changes can be added through DiffX.add_change().

encoding#

The default encoding for preamble and metadata sections anywhere under this section.

This may be None, in which case the parent DiffX.encoding value will be used.

Changing this will not affect the in-memory representation of any data, but it will affect how it’s written.

Type

unicode
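The fallback described here, where a None encoding defers to the parent section, can be sketched as a walk up the parent chain. The SketchNode class and resolve_encoding() helper below are hypothetical names used only for illustration; they are not pydiffx APIs.

```python
class SketchNode(object):
    """Hypothetical stand-in for a section with an optional encoding."""

    def __init__(self, parent=None, encoding=None):
        self.parent = parent
        self.encoding = encoding


def resolve_encoding(section):
    """Return the nearest explicitly-set encoding, or None."""
    while section is not None:
        if section.encoding is not None:
            return section.encoding

        # Nothing set here, so defer to the parent section.
        section = section.parent

    # No section in the chain set an encoding; none can be assumed.
    return None


root = SketchNode(encoding='utf-8')
change = SketchNode(parent=root)
file_section = SketchNode(parent=change, encoding='latin1')

print(resolve_encoding(change))
print(resolve_encoding(file_section))
```

Because the lookup happens when the value is needed, changing an encoding on a parent affects how unset children are written without touching their in-memory data.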

meta#

Metadata for the change.

Type

dict

meta_encoding#

Encoding used when reading/writing the metadata in the file.

See DiffXMetaSection.encoding.

Type

unicode

meta_section#

The actual metadata section. This will generally not be accessed directly.

Type

DiffXMetaSection

preamble#

The preamble content describing the change.

Type

unicode

preamble_encoding#

Encoding used when reading/writing the preamble content in the file.

See DiffXPreambleSection.encoding.

Type

unicode

preamble_indent#

Indentation applied to each line of preamble content.

See DiffXPreambleSection.indent.

Type

int

preamble_line_endings#

The type of line endings used in the preamble content.

See DiffXPreambleSection.line_endings.

Type

unicode

preamble_mimetype#

The mimetype representing the format of the preamble content.

See DiffXPreambleSection.mimetype.

Type

unicode

preamble_section#

The actual preamble section. This will generally not be accessed directly.

Type

DiffXPreambleSection

property subsections#

A list of the preamble, meta, and file subsections.

Type:

list of BaseDiffXSection

add_file(**attrs)#

Add a new file section.

Parameters

**attrs (dict) – Attributes to set on the section. This may consist of any attributes listed on DiffXFileSection.

Returns

The newly-added file section.

Return type

DiffXFileSection

Raises

pydiffx.errors.DiffXUnknownOptionError – One or more attribute names are invalid.

generate_stats()#

Generate statistics for the change section’s metadata.

This will gather statistics on the number of files, insertions, deletions, and total lines changed.

This should only be run once the change is complete. Normally, callers will want to call DiffX.generate_stats() instead.

class pydiffx.dom.objects.DiffXFileSection(parent_section=None, **attrs)#

Bases: pydiffx.dom.properties.ContainerOptionsMixin, pydiffx.dom.properties.DiffOptionsMixin, pydiffx.dom.properties.MetaOptionsMixin, pydiffx.dom.objects.BaseDiffXContainerSection

A file section within a change section.

A file represents a change to a particular file. This may be a change to the file contents, or just to the metadata of the file.

Metadata must always provide sufficient information for identifying and working with the file without having to parse the embedded diff.

Files can be added through DiffXChangeSection.add_file().

diff#

The file’s Unified Diff contents.

This may be a plain Unified Diff, or it may be a vendor-specific variant (such as a Git-style diff).

Type

bytes

diff_encoding#

The encoding of the diff content.

See DiffXFileDiffSection.encoding.

Type

unicode

diff_line_endings#

The identifier for the type of line endings (DOS or UNIX) separating each line of the diff content.

See DiffXFileDiffSection.line_endings.

Type

unicode

diff_section#

The actual diff section. This will generally not be accessed directly.

Type

DiffXFileDiffSection

diff_type#

The type of the diff (text or binary).

See DiffXFileDiffSection.type.

Type

unicode

encoding#

The default encoding for this section.

This is largely redundant with meta_encoding, as the metadata is the only content section affected by this encoding. However, it’s here for consistency and future expansion.

This may be None, in which case the parent DiffXChangeSection.encoding value will be used.

Changing this will not affect the in-memory representation of any data, but it will affect how it’s written.

Type

unicode

meta#

Metadata for the file.

Type

dict

meta_encoding#

The encoding used when reading/writing the metadata content in the file.

See DiffXMetaSection.encoding.

Type

unicode

meta_section#

The actual metadata section. This will generally not be accessed directly.

Type

DiffXMetaSection

generate_stats()#

Generate statistics for the file section’s metadata.

This will gather statistics on the number of insertions, deletions, and total lines changed.

Note that if the content in diff has a parse error, the data may be incorrect.

This should only be run once the change is complete. Normally, callers will want to call DiffX.generate_stats() instead.
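To give a sense of what counting insertions and deletions involves, here is a minimal sketch of a hunk-aware counter for a single-file Unified Diff. The count_diff_stats() function is a hypothetical helper, not the method pydiffx uses, and it leaves out the total lines changed because this page does not define how that figure is derived.

```python
import re

# Matches hunk headers such as "@@ -1,3 +1,4 @@"; a missing count means 1.
HUNK_RE = re.compile(rb'^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@')


def count_diff_stats(diff):
    """Count inserted and deleted lines in a Unified Diff (bytes)."""
    insertions = deletions = 0
    old_left = new_left = 0  # Lines still expected in the current hunk.

    for line in diff.splitlines():
        if old_left <= 0 and new_left <= 0:
            # Outside a hunk: skip file headers until a hunk header.
            m = HUNK_RE.match(line)

            if m:
                old_left = int(m.group(1)) if m.group(1) else 1
                new_left = int(m.group(2)) if m.group(2) else 1

            continue

        if line.startswith(b'+'):
            insertions += 1
            new_left -= 1
        elif line.startswith(b'-'):
            deletions += 1
            old_left -= 1
        elif line.startswith(b'\\'):
            # "\ No newline at end of file" marker; not a content line.
            continue
        else:
            # Context line: present in both the old and new file.
            old_left -= 1
            new_left -= 1

    return {'insertions': insertions, 'deletions': deletions}


sample = (b'--- a/file\n+++ b/file\n@@ -1,3 +1,3 @@\n'
          b' context\n-old line\n+new line\n more\n')
print(count_diff_stats(sample))
```

Tracking the hunk's declared line counts avoids mistaking a deleted line such as "-- comment" (rendered as "--- comment") for a file header. If the hunk headers are wrong, the counts will be too, which matches the caveat above about parse errors.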

class pydiffx.dom.objects.DiffXPreambleSection(**kwargs)#

Bases: pydiffx.dom.objects.BaseDiffXContentSection

A preamble section.

The contents and options for this section will generally be accessed through the parent section’s attributes.

data_type#

alias of str

encoding#

The encoding used when reading/writing the preamble content.

Changing this will not affect the in-memory representation of the preamble, but it will affect how it’s written.

If None, the encoding option of the parent section is used instead.

Type:

unicode

indent#

The indentation applied to each line of the preamble content.

This will be added to the beginning of each encoded line of the preamble when reading/writing the preamble content in the file.

Changing this will not affect the in-memory representation of the preamble, but it will affect how it’s written.

If None, no indentation will be applied.

It’s recommended to use an indentation of 4, to ensure preamble content does not impact parsing.

Type:

int
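As a small illustration of per-line indentation, the sketch below prefixes every line of some preamble text with a fixed number of spaces. The indent_preamble() helper is hypothetical, and both the use of spaces and the indenting of blank lines are assumptions made for this example rather than documented pydiffx behavior.

```python
def indent_preamble(text, indent):
    """Prefix each line of text with `indent` spaces (assumed behavior)."""
    if not indent:
        # None or 0: leave the content untouched.
        return text

    prefix = ' ' * indent

    return '\n'.join(prefix + line for line in text.split('\n'))


preamble = 'Summary line\n\n#diffx-looking text in the body'
indented = indent_preamble(preamble, 4)

print(indented)
```

The last line of the sample shows why indentation helps: once indented, text that resembles a section header no longer begins at the start of a line.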

line_endings#

The type of line endings used in the preamble content.

Valid values are defined in LineEndings.

This should be explicitly set if the type of line endings is known, as a hint to parsers.

If None, parsers will need to carefully handle newline detection based on their needs.

Type:

unicode

mimetype#

The mimetype representing the format of the preamble content.

This can help consumers render the preamble content the way it was meant to be seen.

Valid values are defined in PreambleMimeType.

If None, the preamble content is assumed to be plain text.

Type:

unicode

class pydiffx.dom.objects.DiffXMetaSection(**kwargs)#

Bases: pydiffx.dom.objects.BaseDiffXContentSection

A metadata section.

The contents and options for this section will generally be accessed through the parent section’s attributes.

data_type#

alias of dict

encoding#

The encoding used when reading/writing the metadata content in the file.

Changing this will not affect the in-memory representation of the metadata, but it will affect how it’s written.

If None, the section’s encoding will be used instead.

Type:

unicode

format#

The metadata format used when reading/writing the content in the file.

This is available for future expansion. For now, it will always be JSON.

Type:

unicode

class pydiffx.dom.objects.DiffXFileDiffSection(**kwargs)#

Bases: pydiffx.dom.objects.BaseDiffXContentSection

A diff content section.

The contents and options for this section will generally be accessed through the parent section’s attributes.

data_type#

alias of bytes

encoding#

The encoding of the diff content.

This _does not_ inherit from any other section’s encoding. It must be explicitly provided for an encoding to be set.

It’s recommended that diff generators set this if they know the encoding of the file being changed.

If None, no encoding can be assumed.

Type:

unicode

line_endings#

The type of line endings used in the diff content.

Valid values are defined in LineEndings.

This should be explicitly set if the type of line endings is known, as a hint to parsers. Diffs may legitimately contain newline characters of an alternate type that are not intended to be interpreted as newlines. This hint can help avoid issues parsing those diffs.

If None, parsers will need to carefully handle newline detection based on their needs.

Type:

unicode
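The value of the line endings hint is easiest to see with content that embeds a stray carriage return. In the sketch below, split_diff_lines() is a hypothetical helper, and the 'unix' and 'dos' strings are assumed stand-ins for the values defined in LineEndings.

```python
# Assumed identifiers; the real values are defined in LineEndings.
_TERMINATORS = {
    'unix': b'\n',
    'dos': b'\r\n',
}


def split_diff_lines(data, line_endings=None):
    """Split diff bytes into lines, honoring a line endings hint."""
    if line_endings is None:
        # No hint: fall back to splitting on \n, \r, and \r\n alike.
        return data.splitlines()

    lines = data.split(_TERMINATORS[line_endings])

    if lines and lines[-1] == b'':
        # Drop the empty piece left after the final terminator.
        lines.pop()

    return lines


data = b'+first\rstill first\n+second\n'

print(split_diff_lines(data, 'unix'))
print(split_diff_lines(data))
```

With the hint, the embedded carriage return stays inside the first line; without it, the same bytes split into three lines.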

type#

The type of the diff (text or binary).

Valid values are defined in DiffType.

Type:

unicode