bubbles (brewery2) - operations


Upload: stefan-urbanek

Post on 03-Jan-2016


DESCRIPTION

List of operations in Bubbles (Brewery2) and their signatures that are implemented in the core package.

TRANSCRIPT


Bubbles Operations

For Bubbles v0.1, June 2013

Operation | Arguments | Description | Signatures

Metadata operations

field_filter | obj, keep, drop, rename | Filters fields of an object. keep: keep only the listed fields; drop: keep all fields except those in the drop list; rename: new field names. | ‣rows ‣sql
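
The keep, drop and rename arguments can be illustrated with a small plain-Python sketch. It does not use Bubbles itself: the function name `field_filter_sketch`, the representation of an object as a field list plus row tuples, the treatment of rename as a dictionary, and the choice to apply keep before drop are all assumptions made for illustration.

```python
def field_filter_sketch(fields, rows, keep=None, drop=None, rename=None):
    """Illustrative stand-in only, not the Bubbles implementation.

    fields: list of field names; rows: iterable of tuples.
    """
    # Assumption: keep is applied first, then drop removes from what is left.
    selected = [
        name for name in fields
        if (not keep or name in keep) and (not drop or name not in drop)
    ]
    # Assumption: rename maps old field names to new ones.
    rename = rename or {}
    indexes = [fields.index(name) for name in selected]
    new_fields = [rename.get(name, name) for name in selected]
    # Rows stay lazy, in the spirit of the rows (iterator) representation.
    new_rows = (tuple(row[i] for i in indexes) for row in rows)
    return new_fields, new_rows


fields = ["id", "name", "amount"]
rows = [(1, "a", 10), (2, "b", 20)]
new_fields, new_rows = field_filter_sketch(
    fields, rows, drop=["name"], rename={"amount": "total"}
)
```

Only the metadata (the field list) and the column positions change; the row values themselves are passed through untouched.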

Row operations

filter_by_value | obj, field, value | Get rows where field is equal to value. | ‣rows ‣sql
filter_by_set | obj, field, set | Get rows where field is one of the values in the set. | ‣rows
filter_by_range | obj, field, from, to | Get rows where field is within the given range. | (not yet implemented)
filter_by_predicate | obj, fields, predicate | Get rows selected by the predicate. The predicate receives the values of the given fields. | ‣rows ‣records
distinct | obj[, key] | Distinct values of the key fields. | ‣rows ‣sql
first_unique | obj[, key][, discard] | The first row for every distinct value of the key fields. | ‣rows
sample | obj, value[, mode] | Provide a sample of the object’s rows based on mode. The mode may be first, nth or random. | ‣rows ‣sql
sort | obj, order | Returns an object with rows ordered according to order. Order is a list of tuples (field, order). | ‣rows ‣sql
aggregate | obj, keys, measures, include_count | Aggregate measures by keys. | ‣rows
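
The difference between distinct and first_unique, and the shape of the order argument of sort, may be easier to see in code. The following is a plain-Python sketch of the described behaviour, not the Bubbles implementation: the function names, the use of positional key indexes, and the "asc"/"desc" direction values are assumptions, and the discard option is left out.

```python
def distinct_sketch(rows, key_indexes):
    """Yield each distinct key (as a tuple) once, in order of first appearance."""
    seen = set()
    for row in rows:
        key = tuple(row[i] for i in key_indexes)
        if key not in seen:
            seen.add(key)
            yield key


def first_unique_sketch(rows, key_indexes):
    """Yield the whole first row for each distinct key."""
    seen = set()
    for row in rows:
        key = tuple(row[i] for i in key_indexes)
        if key not in seen:
            seen.add(key)
            yield row


def sort_sketch(fields, rows, order):
    """Sort rows by a list of (field, direction) tuples.

    Assumption: direction is the string "asc" or "desc".
    """
    result = list(rows)
    # Python's sort is stable, so applying the keys from last to first
    # gives the first (field, direction) pair the highest priority.
    for field, direction in reversed(order):
        index = fields.index(field)
        result.sort(key=lambda row, i=index: row[i],
                    reverse=(direction == "desc"))
    return result


fields = ["name", "n"]
rows = [("a", 1), ("b", 2), ("a", 3)]
keys = list(distinct_sketch(rows, [0]))
firsts = list(first_unique_sketch(rows, [0]))
ordered = sort_sketch(fields, rows, [("name", "asc"), ("n", "desc")])
```

Here distinct returns only the key values, while first_unique returns complete rows.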

Field Operations

text_substitute | obj, field, substitutions | Perform substitutions (pattern, value) on field. | ‣rows
string_strip | obj, [fields, [chars]] | Strip whitespace (or chars) from fields, or from all string and text fields. | ‣rows
append_constant_fields | obj, fields, values | Appends fields with the specified constant values to the object. | ‣rows ‣sql
dates_to_dimension | obj, [fields, [unknown_date]] | Changes the specified fields (or all date fields) to a date dimension key in the form YYYYMMDD. The unknown_date value is used for empty date fields. | ‣rows ‣sql
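
As an illustration of the YYYYMMDD key produced by dates_to_dimension, here is a minimal plain-Python sketch. It is not taken from Bubbles: the helper name, the choice of an integer key (rather than a string), and the default of 0 for unknown_date are assumptions.

```python
import datetime


def date_to_dimension_key(value, unknown_date=0):
    """Convert a date to a YYYYMMDD key; empty dates map to unknown_date.

    Assumptions: the key is an integer and 0 is the default unknown value.
    """
    if value is None:
        return unknown_date
    return value.year * 10000 + value.month * 100 + value.day


key = date_to_dimension_key(datetime.date(2013, 6, 5))
missing = date_to_dimension_key(None)
```

A key built this way sorts in calendar order, which is one reason the YYYYMMDD form is common for date dimensions.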

data brewery Bubbles – operations

Revision 1, June 2013, Bubbles 0.1 prototype


Compositions

append | objects[] | Append objects with the same fields. | ‣rows ‣sql
join_details | master, detail, master_key, detail_key | Composes master and detail objects using left (inner) join by matching master_key field(s) with detail_key field(s). | ‣rows,rows ‣sql,sql
added_keys | dimension, source, dimension_key, source_key | Get keys that were added to the source when compared with the dimension. Comparison is done on the specified keys. | ‣sql,sql
added_rows | dimension, source, dimension_key, source_key | Get whole rows that were added to the source when compared with the dimension. Comparison is done on the specified keys. | ‣sql,sql ‣sql,rows
changed_rows | dimension, source, dimension_key, source_key, fields, version_field | Get rows that were changed in the source (fields are compared for change). Row matching is done on the specified keys. | ‣sql,sql
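
The comparison behind added_keys and added_rows can be sketched in plain Python. This only illustrates the described semantics on in-memory data; it is not the Bubbles implementation. The function names and the representation of rows as dictionaries with a single (non-composite) key are assumptions.

```python
def added_keys_sketch(dimension_rows, source_rows, dimension_key, source_key):
    """Yield keys present in the source but missing from the dimension."""
    known = {row[dimension_key] for row in dimension_rows}
    reported = set()
    for row in source_rows:
        key = row[source_key]
        if key not in known and key not in reported:
            reported.add(key)
            yield key


def added_rows_sketch(dimension_rows, source_rows, dimension_key, source_key):
    """Yield whole source rows whose key is missing from the dimension."""
    known = {row[dimension_key] for row in dimension_rows}
    return (row for row in source_rows if row[source_key] not in known)


dimension = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
source = [
    {"code": 2, "label": "b"},
    {"code": 3, "label": "c"},
    {"code": 4, "label": "d"},
]
new_keys = list(added_keys_sketch(dimension, source, "id", "code"))
new_rows = list(added_rows_sketch(dimension, source, "id", "code"))
```

In an SQL setting the same result would typically come from an anti-join between source and dimension on the key columns.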

Auditing

distinct_count | obj[, fields] | Count the number of rows for distinct values of fields (or of all fields). | ‣sql
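
What distinct_count reports can be sketched with the standard library. This is a plain-Python illustration of the description above, not the Bubbles implementation (which is listed with an sql signature only); the function name and the field-list plus row-tuples representation are assumptions.

```python
from collections import Counter


def distinct_count_sketch(fields, rows, count_fields=None):
    """Count rows per distinct combination of count_fields (default: all fields)."""
    names = count_fields or fields
    indexes = [fields.index(name) for name in names]
    return Counter(tuple(row[i] for i in indexes) for row in rows)


fields = ["city", "year"]
rows = [("Bratislava", 2012), ("Bratislava", 2013), ("Brno", 2013)]
per_city = distinct_count_sketch(fields, rows, ["city"])
per_row = distinct_count_sketch(fields, rows)
```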

Assertions

assert_unique | obj[, key] | There should be no row (or key) duplicates in the object. | ‣sql

Conversions

as_dict | obj, key, value | Converts the object to a Python dictionary. | ‣rows
as_records | obj | Returns an object with records representation. | ‣rows ‣sql
fetch_all | obj | Fetches (consumes) all rows into a list and returns an object with rows representation. | ‣rows
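
The following plain-Python sketch illustrates as_dict and the "consumes" remark on fetch_all. It is not the Bubbles implementation: the function names and the field-list plus row-tuples representation are assumptions.

```python
def as_dict_sketch(fields, rows, key, value):
    """Build a dictionary mapping the key field to the value field."""
    key_index = fields.index(key)
    value_index = fields.index(value)
    return {row[key_index]: row[value_index] for row in rows}


def fetch_all_sketch(rows):
    """Consume an iterator of rows into a list."""
    return list(rows)


fields = ["code", "label"]
lookup = as_dict_sketch(fields, [(1, "a"), (2, "b")], "code", "label")

row_iterator = iter([(1, "a"), (2, "b")])
fetched = fetch_all_sketch(row_iterator)
# The iterator is now exhausted; only the fetched list can be read again.
leftover = list(row_iterator)
```

Fetching into a list is what makes the rows reusable: an iterator can be read only once, a list any number of times.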

Output

pretty_print | obj, target | Produces textual output to target (or stdout), formatted as a table. | ‣rows

Notes

■ All objects with sql representation currently also provide rows representation. The statements are executed (not necessarily fetched) and objects are handled as iterator objects. Therefore all rows operations can be used.


■ Assertions raise ProbeAssertionError on failure. They can be used in pipelines to stop the process when a condition is not met.

■ Most of the keys may be either a single field or a list of fields (composite keys).
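
The last two notes can be illustrated together with a plain-Python sketch: a uniqueness assertion over a composite key that stops a sequence of steps when it fails. Nothing here is imported from Bubbles. The `ProbeAssertionError` class below is a local stand-in for the error named above, and `assert_unique_sketch` and `run_steps` are hypothetical helpers, not the library's pipeline API.

```python
class ProbeAssertionError(Exception):
    """Local stand-in for the error named in the notes (not imported from Bubbles)."""


def assert_unique_sketch(rows, key_indexes):
    """Raise if two rows share the same (possibly composite) key."""
    seen = set()
    for row in rows:
        key = tuple(row[i] for i in key_indexes)
        if key in seen:
            raise ProbeAssertionError("duplicate key: %r" % (key,))
        seen.add(key)
    return rows


def run_steps(rows, steps):
    """Hypothetical runner: apply steps in order; an exception stops the run."""
    for step in steps:
        rows = step(rows)
    return rows


rows = [(1, "a"), (2, "b"), (1, "c")]

# The composite key (both columns) is unique, so the rows pass through.
checked = run_steps(rows, [lambda r: assert_unique_sketch(r, [0, 1])])

# A single-field key on the first column has a duplicate, so the run stops.
reached_later_step = []
try:
    run_steps(rows, [
        lambda r: assert_unique_sketch(r, [0]),
        lambda r: reached_later_step.append(True) or r,
    ])
    stopped = False
except ProbeAssertionError:
    stopped = True
```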
