bubbles (brewery2) - operations

DESCRIPTION

List of operations in Bubbles (Brewery2) and their signatures that are implemented in the core package.

TRANSCRIPT

Bubbles Operations

For Bubbles v0.1, June 2013

Operation | Arguments | Description | Signatures

Metadata operations

field_filter | obj, keep, drop, rename | Filters fields of an object. keep: keep only the listed fields; drop: keep all fields except those in the drop list; rename: new field names. | ‣rows ‣sql
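
The keep, drop and rename arguments can be illustrated with a small stand-alone sketch. It is plain Python over a field list and row tuples, not the Bubbles implementation; the function name field_filter_sketch, the dictionary form of rename and the rule that keep takes precedence over drop are assumptions made for the example.

```python
def field_filter_sketch(fields, rows, keep=None, drop=None, rename=None):
    """Return (new_fields, new_rows) after applying keep, drop and rename.

    Hypothetical stand-in for illustration only, not the Bubbles function.
    """
    if keep:
        # Assumption: when both are given, keep takes precedence over drop.
        selected = [f for f in fields if f in keep]
    elif drop:
        selected = [f for f in fields if f not in drop]
    else:
        selected = list(fields)

    # Positions of the surviving fields in each row tuple.
    indexes = [fields.index(f) for f in selected]

    # Assumption: rename is a mapping of old field name to new field name.
    rename = rename or {}
    new_fields = [rename.get(f, f) for f in selected]
    new_rows = [tuple(row[i] for i in indexes) for row in rows]
    return new_fields, new_rows


fields = ["id", "name", "amount"]
rows = [(1, "ale", 10), (2, "stout", 20)]
print(field_filter_sketch(fields, rows, drop=["amount"], rename={"name": "label"}))
```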

Row operations

filter_by_value | obj, field, value | Get rows where field is equal to value. | ‣rows ‣sql
filter_by_set | obj, field, set | Get rows where field is one of the values from the set. | ‣rows
filter_by_range | obj, field, from, to | Get rows where field is within the given range. | (not yet implemented)
filter_by_predicate | obj, fields, predicate | Get rows selected by the predicate. The predicate receives the values of the given fields. | ‣rows ‣records
distinct | obj[, key] | Distinct values for the key fields. | ‣rows ‣sql
first_unique | obj[, key][, discard] | Every first row with a distinct value for the key fields. | ‣rows
sample | obj, value[, mode] | Provide a sample of the object's rows based on mode. The mode might be: first, nth, random. | ‣rows ‣sql
sort | obj, order | Returns the object with rows ordered based on order. Order is a list of tuples (field, order). | ‣rows ‣sql
aggregate | obj, keys, measures, include_count | Aggregate measures by keys. | ‣rows
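
To make the row semantics above concrete, here is a rough sketch of a few of these operations in plain Python over a list of dictionaries. These are stand-ins, not the Bubbles functions: the _sketch names, the "asc"/"desc" direction strings and the choice of sum as the aggregation are all assumptions.

```python
from collections import defaultdict

rows = [
    {"type": "ale", "year": 2012, "amount": 10},
    {"type": "stout", "year": 2012, "amount": 20},
    {"type": "ale", "year": 2013, "amount": 5},
]


def filter_by_value_sketch(rows, field, value):
    """Rows where field equals value."""
    return [r for r in rows if r[field] == value]


def filter_by_set_sketch(rows, field, values):
    """Rows where field is one of the given values."""
    values = set(values)
    return [r for r in rows if r[field] in values]


def first_unique_sketch(rows, key):
    """First row seen for every distinct value of the key fields."""
    seen, result = set(), []
    for r in rows:
        k = tuple(r[f] for f in key)
        if k not in seen:
            seen.add(k)
            result.append(r)
    return result


def sort_sketch(rows, order):
    """Order rows by a list of (field, direction) tuples.

    Assumption: direction is the string "asc" or "desc".
    """
    result = list(rows)
    # Sorting by the last criterion first works because list.sort is stable.
    for field, direction in reversed(order):
        result.sort(key=lambda r: r[field], reverse=(direction == "desc"))
    return result


def aggregate_sketch(rows, keys, measures, include_count=True):
    """Sum each measure per key; sum is an assumed aggregation."""
    sums = defaultdict(lambda: [0] * len(measures))
    counts = defaultdict(int)
    for r in rows:
        k = tuple(r[f] for f in keys)
        for i, m in enumerate(measures):
            sums[k][i] += r[m]
        counts[k] += 1
    result = []
    for k, totals in sums.items():
        row = list(k) + totals
        if include_count:
            row.append(counts[k])
        result.append(tuple(row))
    return result


print(aggregate_sketch(rows, ["type"], ["amount"]))
```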

Field operations

text_substitute | obj, field, substitutions | Perform substitutions (pattern, value) on field. | ‣rows
string_strip | obj[, fields[, chars]] | Strip whitespace (or chars) from fields or from all string and text fields. | ‣rows
append_constant_fields | obj, fields, values | Appends fields with the specified constant values to the object. | ‣rows ‣sql
dates_to_dimension | obj[, fields[, unknown_date]] | Changes the specified fields (or all date fields) to a date dimension key in the form YYYYMMDD. The unknown_date value is used for empty date fields. | ‣rows ‣sql
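
The field operations can be pictured with the following plain-Python sketch. It is not the Bubbles code: the _sketch names are hypothetical, patterns are assumed to be regular expressions, and the integer form of the YYYYMMDD key with a default unknown_date of 0 is an assumption.

```python
import re
from datetime import date


def text_substitute_sketch(rows, field, substitutions):
    """Apply each (pattern, value) pair to field; patterns assumed to be regexes."""
    result = []
    for r in rows:
        r = dict(r)
        for pattern, value in substitutions:
            r[field] = re.sub(pattern, value, r[field])
        result.append(r)
    return result


def string_strip_sketch(rows, fields=None, chars=None):
    """Strip whitespace (or chars) from the given fields or from all string fields."""
    result = []
    for r in rows:
        names = fields if fields is not None else [k for k, v in r.items() if isinstance(v, str)]
        result.append({
            k: (v.strip(chars) if k in names and isinstance(v, str) else v)
            for k, v in r.items()
        })
    return result


def dates_to_dimension_sketch(rows, fields, unknown_date=0):
    """Replace date values with an integer key YYYYMMDD.

    Assumptions: the key is an integer and empty dates are None.
    """
    result = []
    for r in rows:
        r = dict(r)
        for f in fields:
            d = r[f]
            r[f] = (d.year * 10000 + d.month * 100 + d.day) if d else unknown_date
        result.append(r)
    return result


rows = [
    {"name": "  Pale  Ale ", "brewed": date(2013, 6, 1)},
    {"name": "Stout", "brewed": None},
]
cleaned = text_substitute_sketch(string_strip_sketch(rows), "name", [(r"\s+", " ")])
print(dates_to_dimension_sketch(cleaned, ["brewed"]))
```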

data brewery Bubbles – operations

Revision 1, June 2013, Bubbles 0.1 prototype

Compositions

append | objects[] | Append objects with the same fields. | ‣rows ‣sql
join_details | master, detail, master_key, detail_key | Composes master and detail objects using left (inner) join by matching master_key field(s) with detail_key field(s). | ‣rows,rows ‣sql,sql
added_keys | dimension, source, dimension_key, source_key | Get keys that were added to the source when compared with the dimension. Comparison is done on the specified keys. | ‣sql,sql
added_rows | dimension, source, dimension_key, source_key | Get whole rows that were added to the source when compared with the dimension. Comparison is done on the specified keys. | ‣sql,sql ‣sql,rows
changed_rows | dimension, source, dimension_key, source_key, fields, version_field | Get rows that were changed in the source (fields are compared for change). Row matching is done on the specified keys. | ‣sql,sql
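
The three comparison operations (added_keys, added_rows, changed_rows) are easiest to tell apart on a small example. The sketch below is plain Python over lists of dictionaries, whereas the operations listed above work on sql objects. The _sketch names are hypothetical, fields are assumed to share names in both objects, and version_field is left out because its behaviour is not described here.

```python
def key_of(row, key):
    """Tuple of the row's values for the key fields."""
    return tuple(row[f] for f in key)


def added_keys_sketch(dimension, source, dimension_key, source_key):
    """Keys present in source but missing from dimension, in first-seen order."""
    known = {key_of(r, dimension_key) for r in dimension}
    result = []
    for r in source:
        k = key_of(r, source_key)
        if k not in known:
            known.add(k)  # avoid reporting the same new key twice
            result.append(k)
    return result


def added_rows_sketch(dimension, source, dimension_key, source_key):
    """Whole source rows whose key is missing from dimension."""
    known = {key_of(r, dimension_key) for r in dimension}
    return [r for r in source if key_of(r, source_key) not in known]


def changed_rows_sketch(dimension, source, dimension_key, source_key, fields):
    """Source rows whose key exists in dimension but whose fields differ.

    Assumption: compared fields have the same names in both objects.
    """
    current = {key_of(r, dimension_key): r for r in dimension}
    result = []
    for r in source:
        match = current.get(key_of(r, source_key))
        if match is not None and any(r[f] != match[f] for f in fields):
            result.append(r)
    return result


dimension = [{"id": 1, "name": "ale"}, {"id": 2, "name": "stout"}]
source = [
    {"code": 1, "name": "ale"},
    {"code": 2, "name": "porter"},
    {"code": 3, "name": "lager"},
]
print(added_keys_sketch(dimension, source, ["id"], ["code"]))
print(changed_rows_sketch(dimension, source, ["id"], ["code"], ["name"]))
```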

Auditing

distinct_count | obj[, fields] | Count the number of rows for distinct values of fields (or all fields). | ‣sql
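
As a rough picture of what distinct_count reports, the following sketch counts rows per distinct combination of field values using collections.Counter. It runs on in-memory dictionaries, whereas the operation above is listed only for sql objects; the function name and the Counter return value are assumptions.

```python
from collections import Counter


def distinct_count_sketch(rows, fields=None):
    """Count rows for each distinct combination of values in fields.

    Hypothetical stand-in; when fields is None all fields of the row are used.
    """
    counts = Counter()
    for r in rows:
        names = fields if fields is not None else list(r)
        counts[tuple(r[f] for f in names)] += 1
    return counts


rows = [
    {"type": "ale", "year": 2012},
    {"type": "stout", "year": 2012},
    {"type": "ale", "year": 2013},
]
print(distinct_count_sketch(rows, ["type"]))
```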

Assertions

assert_unique | obj[, key] | There should be no row (or key) duplicates in the object. | ‣sql
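
A minimal sketch of the uniqueness check, assuming rows are dictionaries held in memory. The ProbeAssertionError class defined here is a local stand-in that only borrows the name used in the notes; it is not imported from Bubbles, and the function name is hypothetical.

```python
class ProbeAssertionError(Exception):
    """Local stand-in for the error named in the notes (not the Bubbles class)."""


def assert_unique_sketch(rows, key=None):
    """Raise ProbeAssertionError on the first duplicate row (or key)."""
    seen = set()
    for r in rows:
        # Without a key, the whole row is compared.
        k = tuple(r[f] for f in key) if key else tuple(r.values())
        if k in seen:
            raise ProbeAssertionError("duplicate value: %r" % (k,))
        seen.add(k)


assert_unique_sketch([{"id": 1}, {"id": 2}], ["id"])  # passes silently

try:
    assert_unique_sketch([{"id": 1}, {"id": 1}], ["id"])
except ProbeAssertionError as error:
    print("stopped:", error)
```

Raising an exception, rather than returning a flag, is what lets a failed check interrupt a longer chain of steps.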

Conversions

as_dict | obj, key, value | Converts the object to a Python dictionary. | ‣rows
as_records | obj | Returns an object with records representation. | ‣rows ‣sql
fetch_all | obj | Fetches (consumes) all rows into a list and returns an object with rows representation. | ‣rows
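
The difference between the three conversions can be sketched as follows, assuming an object is a list of field names plus an iterable of row tuples. The _sketch functions are hypothetical stand-ins, and representing a record as a dictionary is an assumption.

```python
def as_dict_sketch(fields, rows, key, value):
    """Build {key field value: value field value} from the rows."""
    k, v = fields.index(key), fields.index(value)
    return {row[k]: row[v] for row in rows}


def as_records_sketch(fields, rows):
    """Yield each row as a record; a dictionary is assumed as the record type."""
    for row in rows:
        yield dict(zip(fields, row))


def fetch_all_sketch(rows):
    """Consume the iterator and keep all rows in a list."""
    return list(rows)


fields = ["id", "name"]
rows = [(1, "ale"), (2, "stout")]

print(as_dict_sketch(fields, rows, "id", "name"))
print(list(as_records_sketch(fields, rows)))

# An iterator can be consumed only once; fetching keeps the rows reusable.
iterator = iter(rows)
fetched = fetch_all_sketch(iterator)
leftover = list(iterator)  # nothing is left after fetching
```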

Output

pretty_print | obj, target | Produces textual output to target (or stdout) formatted as a table. | ‣rows
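
One possible way to produce table-formatted text output is sketched below. The layout (column separator, header rule) and the function name are assumptions, not the format Bubbles prints; target is assumed to be any object with a write method, defaulting to stdout.

```python
import io
import sys


def pretty_print_sketch(fields, rows, target=None):
    """Write fields and rows to target (or stdout) as an aligned text table."""
    target = target or sys.stdout
    table = [[str(v) for v in row] for row in rows]
    # Column width is the longest cell in the column, header included.
    widths = [
        max([len(f)] + [len(r[i]) for r in table])
        for i, f in enumerate(fields)
    ]

    def line(cells):
        return " | ".join(c.ljust(w) for c, w in zip(cells, widths))

    target.write(line(fields) + "\n")
    target.write("-+-".join("-" * w for w in widths) + "\n")
    for r in table:
        target.write(line(r) + "\n")


# Writing to an in-memory buffer instead of stdout.
buffer = io.StringIO()
pretty_print_sketch(["id", "name"], [(1, "ale"), (2, "stout")], buffer)
print(buffer.getvalue(), end="")
```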

Notes

■ All objects with an sql representation currently also provide a rows representation. The statements are executed (not necessarily fetched) and the objects are handled as iterator objects. Therefore all rows operations can be used.

■ Assertions raise ProbeAssertionError on failure. They can be used in pipelines to stop the process when a condition is not met.

■ Most of the keys may be either a single field or a list of fields (composite keys).
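
The note on composite keys suggests that a key argument is normalised before use. A minimal sketch of such a normalisation, with hypothetical helper names and rows assumed to be dictionaries:

```python
def key_fields_sketch(key):
    """Return the key as a list of field names.

    Accepts either a single field name or a list of names (composite key).
    """
    if isinstance(key, str):
        return [key]
    return list(key)


def key_value_sketch(row, key):
    """Tuple of the row's values for a single or composite key."""
    return tuple(row[f] for f in key_fields_sketch(key))


row = {"year": 2013, "month": 6, "amount": 10}
print(key_value_sketch(row, "year"))
print(key_value_sketch(row, ["year", "month"]))
```

Returning a tuple in both cases means single and composite keys can be compared and hashed the same way.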
