Framework (v5) #1119

Merged
merged 834 commits into from
Aug 24, 2022
834 commits
3eed273
Fixed tests
roll Jul 20, 2022
e19ebea
Fixed tests
roll Jul 20, 2022
df86768
Updated stats.bytes on reading
roll Jul 21, 2022
628e3dc
Support MD5 hash as well
roll Jul 21, 2022
33d2134
Updated errors
roll Jul 21, 2022
0016b54
Fixed tests
roll Jul 21, 2022
f3891d5
Renamed stats.time -> seconds
roll Jul 21, 2022
668b072
Removed resource.__iter__
roll Jul 22, 2022
229eb61
Fixed v0/url rule
roll Jul 22, 2022
82a0187
Create dialect/stats in the getters
roll Jul 22, 2022
3c63b54
Implemented program.convert (#1195)
shashigharti Jul 22, 2022
3b2b3b8
Update convert feature to the latest changes
roll Jul 22, 2022
1fea14e
Fixed program.convert not found logic
roll Jul 22, 2022
132ed07
Improved metadata logic
roll Jul 22, 2022
fdc1141
Support Parquet data format (#1186)
roll Jul 22, 2022
f8d9f02
Fixed ParquetControl
roll Jul 22, 2022
fdc103f
Added a issue-1205 test
roll Jul 24, 2022
99e5a11
Added doc blocks to the cell steps classes (#1204)
shashigharti Jul 24, 2022
99e591b
Fixed linting
roll Jul 24, 2022
5ef47b9
Fixed tests
roll Jul 24, 2022
647f516
Updated tests
roll Jul 24, 2022
5b703ae
Added dereference tests
roll Jul 24, 2022
84fcac8
Added safety tests
roll Jul 24, 2022
f611cb8
Fixed behaviour on emty resources
roll Jul 24, 2022
a5bee93
Fixed csv quote issue
roll Jul 24, 2022
47a18dd
Fixed more tests
roll Jul 24, 2022
6f560ec
Fixed more package tests
roll Jul 24, 2022
78f1cb8
Removed debug print
roll Jul 25, 2022
bc11044
Fixed type for html format
roll Jul 25, 2022
6963394
Updated xfails
roll Jul 25, 2022
42779be
Review actions' xfails
roll Jul 25, 2022
8960934
Updated xfails
roll Jul 25, 2022
7f3d34e
1205/recover failing tests (#1209)
shashigharti Jul 28, 2022
46a6c4b
Fixed linting
roll Jul 28, 2022
cf6d867
Fixed tests
roll Jul 28, 2022
6963dfd
Fixed custom-field
roll Jul 30, 2022
4765f6a
Enabled xpassed tests
roll Jul 30, 2022
8cfbc0a
Fixed extract-dialect
roll Jul 30, 2022
f36ce31
Recovered spss tests
roll Jul 30, 2022
955d00b
Fixed sync-schema
roll Jul 30, 2022
2d68f96
Recovered inquiry ci tests
roll Jul 30, 2022
0153e95
Fixed gsheets on ci
roll Jul 30, 2022
f441d3c
Fixed bigquery tests
roll Jul 30, 2022
f549a9f
Fixed bigquery on ci
roll Jul 30, 2022
e960666
Fixed bigquery on ci
roll Jul 30, 2022
1c5d5a5
Fixed bigquery on ci
roll Jul 30, 2022
f5e563b
Fixed bigquery on ci
roll Jul 30, 2022
917ee6d
Fixed bigquery on ci
roll Jul 30, 2022
2a7cb36
Skip ckan tests until 475 is done
roll Jul 30, 2022
346192b
Fixed report tests
roll Jul 30, 2022
28c8abb
Fixed linting
roll Jul 30, 2022
2234146
Fixed linting
roll Jul 30, 2022
024ab1c
Updated skip reason
roll Jul 30, 2022
978ef9f
Fixed tests
roll Jul 30, 2022
7a8798b
Skip bigquery tests
roll Jul 30, 2022
b479acc
Handled TODOs in resource tests
roll Jul 30, 2022
6ae6b4e
Refactored metadata.metadata_validate
roll Jul 30, 2022
9e56cf4
Implemented resource.profiles
roll Jul 30, 2022
75c877c
Implemented package.profiles
roll Jul 30, 2022
26dc6be
Fixed package-zip
roll Jul 30, 2022
b1ae724
Fixed json-data
roll Jul 30, 2022
50c695c
Replaced docs TODO by NOTE
roll Jul 31, 2022
1d7a299
Fixed resource's safety
roll Jul 31, 2022
66792e6
Fixed package's safety
roll Jul 31, 2022
86b066d
Updated dereference test
roll Aug 1, 2022
960adb1
Recovered steps.field_update
roll Aug 1, 2022
a3f5980
Recovered field-update related tests
roll Aug 1, 2022
1316ec9
Removed outdated NOTEs
roll Aug 1, 2022
07b5e0c
Recovered steps.field_add
roll Aug 1, 2022
64db86a
Recovered steps.field_merge
roll Aug 2, 2022
268c1ff
Recovered steps.field_pack
roll Aug 2, 2022
5d37416
Recovered steps.field_split
roll Aug 2, 2022
f62fe1e
Recovered steps.resource_update
roll Aug 2, 2022
99dbcd4
Renamed metadata to descriptor in steps
roll Aug 2, 2022
41c1ac2
Recovered steps.row_sort
roll Aug 2, 2022
80c63ed
Recovered steps.row_subset
roll Aug 2, 2022
e83560e
Recovered steps.table_write
roll Aug 2, 2022
010211a
Skip steps.table_pivot
roll Aug 2, 2022
545b9fe
Recovered steps.table_attach
roll Aug 2, 2022
9f5bbc3
Recovered steps.table_diff
roll Aug 2, 2022
8773f9b
Recovered steps.table_intersect
roll Aug 2, 2022
70702da
Skip steps.table_merge tests
roll Aug 2, 2022
17f4320
Recovered steps.table_join
roll Aug 2, 2022
06216f6
Fixed dereference for metadata
roll Aug 3, 2022
c623dd4
Recovered resource's dereference
roll Aug 3, 2022
14369d2
Back to metadata.to_descriptor_source
roll Aug 3, 2022
ab71d1f
Recovered package's dereference
roll Aug 3, 2022
5d74554
Added new metadata validate API
roll Aug 3, 2022
0740be8
Use explicit terminology for legacy
roll Aug 3, 2022
2b40ced
Fixed mediatype
roll Aug 3, 2022
d213395
Make json non-tabular by default
roll Aug 3, 2022
0bb856c
Support non-tabular resource.data
roll Aug 3, 2022
eb53656
Minor resource code update
roll Aug 3, 2022
d60a1fa
Added fk issue
roll Aug 4, 2022
a573b44
Handle schema.fields not array
roll Aug 4, 2022
dd82cd9
Handle crictial descriptor errors in structure
roll Aug 5, 2022
377d07f
Fixed field validation
roll Aug 5, 2022
c8dca68
Catch unsupported field types
roll Aug 5, 2022
cfd9071
Removed incorrect tests
roll Aug 5, 2022
eb32b49
Fixed data is a string error catching
roll Aug 5, 2022
ea0356c
Bootstrapped metadata's API@2
roll Aug 5, 2022
e7ae629
Started rebasing Schema on new metadata API
roll Aug 5, 2022
bc348e1
Recovered schema tests
roll Aug 6, 2022
322b29e
Recovered fields tests
roll Aug 6, 2022
42c6fb6
Started resource migration
roll Aug 6, 2022
eb508dd
Recovered some resource tests
roll Aug 6, 2022
20dc6f2
Recovered schema/dialect tests
roll Aug 6, 2022
2bc8f3a
Recovered more resource tests
roll Aug 6, 2022
85c9054
Started recovering resource.validate
roll Aug 6, 2022
22af96e
Recovered resource tests
roll Aug 6, 2022
9f012ca
Rebased on metadata_specify
roll Aug 7, 2022
034af47
Moved metadata internals to helpers
roll Aug 7, 2022
8c11108
Migrated package
roll Aug 7, 2022
ce00ad6
Fixed linting
roll Aug 8, 2022
fc6c418
Added square brackets test
roll Aug 8, 2022
23a890b
Recovered resource tests
roll Aug 9, 2022
7673fbd
Recovered package/format tests partially
roll Aug 9, 2022
5e01691
Recovered actions.describe
roll Aug 9, 2022
866a81f
Recovered steps
roll Aug 9, 2022
791ad0a
Recovered Inquiry tests
roll Aug 9, 2022
7abcd5c
Fixed metadata tests
roll Aug 9, 2022
e517a43
Recovered actions
roll Aug 9, 2022
34d001d
Simplified actions code
roll Aug 9, 2022
7073437
Improved error catching in validate
roll Aug 9, 2022
e3c1d0c
Recovered error-catching tests partially
roll Aug 9, 2022
02bd1d2
Recovered format tests
roll Aug 9, 2022
5651b6c
Added more xfails
roll Aug 9, 2022
6bff6e1
Recovered program tests
roll Aug 9, 2022
a3d7618
Recovered package.validate tests
roll Aug 9, 2022
e56105d
Xfailed tests
roll Aug 9, 2022
3e0925d
Fixed report tests
roll Aug 9, 2022
9fe060a
Migrated Error to new metadata API
roll Aug 10, 2022
f6cd9c9
Removed reportTask.scope
roll Aug 10, 2022
78161f0
Fixed report's stats
roll Aug 10, 2022
4f5db1c
Removed system.create_field
roll Aug 10, 2022
478e987
Removed system.create_error
roll Aug 10, 2022
74805b1
Removed system.create_step
roll Aug 10, 2022
8c7e55b
Removed system.create_check
roll Aug 10, 2022
6113916
Removed system.create_control
roll Aug 10, 2022
8aa0e18
Renamed detect_field_candidates
roll Aug 10, 2022
bec561f
Recovered profiles steps
roll Aug 10, 2022
e68fe39
Rebased on exception.to_errors()
roll Aug 10, 2022
08cc4af
Recovered program.transform tests
roll Aug 10, 2022
aeaf155
Imporved transform comments
roll Aug 10, 2022
f86e97c
Fixed strict mode
roll Aug 10, 2022
3f9a221
Fixed zip tests
roll Aug 10, 2022
d86911b
Migrated on system.trusted
roll Aug 10, 2022
cdc4a2c
Recovered security tests
roll Aug 10, 2022
621228b
Fixed package.to_zip
roll Aug 10, 2022
cdf9103
Rebased on system.use_context
roll Aug 11, 2022
991bd63
Rebased on system.onerror
roll Aug 11, 2022
3756fc2
Removed package.detector
roll Aug 11, 2022
7ee07c2
Improved metadata.metadata_import
roll Aug 11, 2022
b5d5f8d
Fixed dereference tests
roll Aug 11, 2022
7c45e45
Reworked basepath logic
roll Aug 11, 2022
6689b0f
Minor improvements
roll Aug 11, 2022
f23bd64
Fixed print(report)
roll Aug 11, 2022
0b30eba
Added resource.write(control)
roll Aug 11, 2022
edc446d
Merge branch 'main' into v5
roll Aug 11, 2022
beacb58
Fixed deps
roll Aug 11, 2022
2177fa2
Fixed tests
roll Aug 11, 2022
23e0689
Merge branch 'main' into v5
roll Aug 12, 2022
1dc98b5
Updted dependencies
roll Aug 12, 2022
3813c93
Bootstrapp the new docs
roll Aug 12, 2022
8226629
Adde first guides
roll Aug 12, 2022
7256323
Added more pages
roll Aug 12, 2022
15435b5
Added more docs
roll Aug 12, 2022
5ef4d46
Migrated formats
roll Aug 12, 2022
c84160e
Migrated schemes
roll Aug 12, 2022
e604a6b
Updated navigation
roll Aug 12, 2022
e0e2781
Added error docs
roll Aug 12, 2022
f9e2734
Reorganized errors
roll Aug 12, 2022
7615938
Reorganized fields
roll Aug 12, 2022
8ef46c4
Added universe/blog
roll Aug 12, 2022
44b98c9
Renamed section
roll Aug 12, 2022
ab90c2a
Migrated describe
roll Aug 13, 2022
20f9fda
Migrated extracting
roll Aug 13, 2022
14e78aa
Migrated validating
roll Aug 13, 2022
17fcdc8
Migrated transforming
roll Aug 13, 2022
65958d6
Migrated checks
roll Aug 13, 2022
f5e0ccd
Updated gitignore
roll Aug 13, 2022
11cd11e
Added link to v4 docs
roll Aug 13, 2022
458451e
Migrated steps
roll Aug 13, 2022
7a26af4
Migrated framework docs
roll Aug 13, 2022
9697f9a
Added missing doc warnings
roll Aug 13, 2022
9ee3f10
Migrated extracting to scripts
roll Aug 13, 2022
3cb9ae3
Started using script's output prop
roll Aug 13, 2022
5cfc161
Updated gitignore
roll Aug 14, 2022
142fd82
Added h3 to topics
roll Aug 14, 2022
cfe58cb
Rebased on from prop in pages
roll Aug 16, 2022
b2fe4d7
Added license to the docs
roll Aug 16, 2022
126f916
Enabled search
roll Aug 16, 2022
2f84c1d
Rebased getting-started on tabs
roll Aug 16, 2022
796d3b8
Updated basic-examples
roll Aug 17, 2022
77babfe
Started updating describing guide
roll Aug 17, 2022
1ff85d2
Updated describing guide
roll Aug 17, 2022
7c647c9
Updated extracting-data
roll Aug 17, 2022
852dfb8
Migrated validating-data
roll Aug 17, 2022
dc8673c
Migrated transforming-data
roll Aug 17, 2022
1b156ba
Updated running-cli/api
roll Aug 18, 2022
dfc146b
Updated package-class
roll Aug 18, 2022
cbd5cc2
Updated resource-class
roll Aug 18, 2022
ff4d0fb
Moved header/row docs
roll Aug 18, 2022
885f341
Updated resource-class
roll Aug 18, 2022
c66779d
Updated structure
roll Aug 18, 2022
0297cc6
Updated schema docs
roll Aug 18, 2022
a02df48
Updated classes docs
roll Aug 18, 2022
8e29f80
Updated detector topics
roll Aug 18, 2022
88c7426
Updated detector docs
roll Aug 18, 2022
4cc4ecc
Added errors docs
roll Aug 18, 2022
ed2fb6d
Updated docs for cell checks
roll Aug 18, 2022
faa0ecd
Updated docs for row checks
roll Aug 18, 2022
31f566e
Updated docs for table checks
roll Aug 18, 2022
c5460c7
Updated docs for baseline checks
roll Aug 18, 2022
95ee09f
Migrated docs for resource steps
roll Aug 19, 2022
9b62485
Improved remarks
roll Aug 19, 2022
47978c9
Migrated docs for table steps
roll Aug 19, 2022
cc14417
Migrated docs for field steps
roll Aug 19, 2022
eb1f7c3
Migrated docs for row steps
roll Aug 19, 2022
34f48cb
Migrated docs for cell steps
roll Aug 19, 2022
4143b41
Migrated schemes docs
roll Aug 19, 2022
8ba785e
Migrated formats docs
roll Aug 19, 2022
201f878
Added a docstring
roll Aug 19, 2022
89898c9
Added references
roll Aug 20, 2022
759057b
Updated steps docs
roll Aug 21, 2022
7e3e43d
Updated checks docs
roll Aug 21, 2022
02ba9be
Added data actions docs
roll Aug 21, 2022
e6e5d82
Added catalog docs
roll Aug 21, 2022
1e9a6a6
Improved framework docs
roll Aug 21, 2022
5aabc05
Merge branch 'main' into v5
roll Aug 22, 2022
0e95139
Merge branch 'v5' into v5-docs
roll Aug 22, 2022
0db83a7
Fixed failing tests (#1234)
shashigharti Aug 22, 2022
4568a73
Added fields docs
roll Aug 22, 2022
1866cc1
Updated architecture docs
roll Aug 22, 2022
1b9fa44
Updated CONTRIBUTING doc
roll Aug 22, 2022
dab58b2
Updated universe doc
roll Aug 22, 2022
003a45d
Added '--pre' to installation instructions
roll Aug 22, 2022
a206b5f
Recovered metadata.to_dict for compat
roll Aug 22, 2022
9c11a99
Removed JavaScript dependency
roll Aug 22, 2022
d99fe79
Renamed pages to docs
roll Aug 22, 2022
2ff35ac
Updated data
roll Aug 22, 2022
3093c53
Rebased on updated reference
roll Aug 23, 2022
557bf4f
Updated migration guide
roll Aug 23, 2022
b47c7b1
Merge pull request #1235 from frictionlessdata/v5-docs
roll Aug 23, 2022
c1df4a4
Fixed tests
roll Aug 23, 2022
133c049
Fixed Schema.from_jsonschema
roll Aug 23, 2022
9eb4dfa
Enabled zipped resource test
roll Aug 23, 2022
54e9aa8
Fixed linting
roll Aug 23, 2022
8022b2d
Use dirs in the blog
roll Aug 23, 2022
b953a54
Improved v5 blog
roll Aug 24, 2022
Added doc blocks to the cell steps classes (#1204)
Added docblocks to steps classes
shashigharti committed Jul 24, 2022
commit 99e5a113d825bd4dc5b3f9e12fc23ee17ac7bf2c
111 changes: 107 additions & 4 deletions frictionless/steps/cell/cell_convert.py
@@ -11,20 +11,123 @@

@attrs.define(kw_only=True)
class cell_convert(Step):
"""Convert cell"""
"""Convert cell

Converts cell values of one or more fields using arbitrary functions, method
invocations or dictionary translations.

Parameters
----------
type : step identifier.
value : value to write into the cells.
function : function, name of a method of the cell value, or translation dictionary.
field_name : name of the field to convert.

Methods
-------
transform_resource(resource):
converts cell value of a resource using arbitrary function, data method
or dictionary translations

Examples
--------
>>> from frictionless import Resource, Pipeline, steps
>>> source = Resource(path="data/transform.csv")
>>> print(source.to_view())
+----+-----------+------------+
| id | name | population |
+====+===========+============+
| 1 | 'germany' | 83 |
+----+-----------+------------+
| 2 | 'france' | 66 |
+----+-----------+------------+
| 3 | 'spain' | 47 |
+----+-----------+------------+
>>> # replacing value
>>> pipeline = Pipeline(
steps=[
steps.cell_convert(field_name='population', value="100"),
],
)
>>> target = source.transform(pipeline)
>>> print(target.to_view())
+----+-----------+------------+
| id | name | population |
+====+===========+============+
| 1 | 'germany' | 100 |
+----+-----------+------------+
| 2 | 'france' | 100 |
+----+-----------+------------+
| 3 | 'spain' | 100 |
+----+-----------+------------+

>>> # using a lambda function
>>> pipeline = Pipeline(
steps=[
steps.cell_convert(function=lambda v: v*2, field_name='population'),
],
)
>>> target = source.transform(pipeline)
>>> print(target.to_view())
+----+-----------+------------+
| id | name | population |
+====+===========+============+
| 1 | 'germany' | 100100 |
+----+-----------+------------+
| 2 | 'france' | 100100 |
+----+-----------+------------+
| 3 | 'spain' | 100100 |
+----+-----------+------------+

>>> # using method of data value
>>> pipeline = Pipeline(
steps=[
steps.cell_convert(function='upper', field_name='name'),
],
)
>>> target = source.transform(pipeline)
>>> print(target.to_view())
+----+-----------+------------+
| id | name | population |
+====+===========+============+
| 1 | 'GERMANY' | 100100 |
+----+-----------+------------+
| 2 | 'FRANCE' | 100100 |
+----+-----------+------------+
| 3 | 'SPAIN' | 100100 |
+----+-----------+------------+

>>> # using a dictionary
>>> pipeline = Pipeline(
steps=[
steps.cell_convert(field_name='name', function = {'GERMANY': 'Z', 'B': 'Y'}),
],
)
>>> target = source.transform(pipeline)
>>> print(target.to_view())
+----+----------+------------+-----------------+
| id | name | population | avg_age |
+====+==========+============+=================+
| 1 | 'Z' | 100100 | Decimal('30.5') |
+----+----------+------------+-----------------+
| 2 | 'FRANCE' | 100100 | Decimal('30.0') |
+----+----------+------------+-----------------+
| 3 | 'SPAIN' | 100100 | Decimal('40.0') |
+----+----------+------------+-----------------+

"""

type = "cell-convert"

# State

value: Optional[Any] = None
"""TODO: add docs"""
"""Value to replace in the field cell"""

function: Optional[Any] = None
"""TODO: add docs"""
"""Function/Data method/Dictionary to apply to the column"""

field_name: Optional[str] = None
"""TODO: add docs"""
"""Name of the field to apply the function to"""

# Transform

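To make the three conversion modes in the `cell_convert` docstring concrete, here is a minimal plain-Python sketch. It is not the frictionless implementation: `convert_cells` is a hypothetical helper, rows are plain dictionaries, and leaving unmapped values unchanged in dictionary mode is an assumption of this sketch.

```python
from typing import Any, Callable, Dict, List, Optional, Union

Row = Dict[str, Any]
Converter = Union[Callable[[Any], Any], str, Dict[Any, Any]]


def convert_cells(
    rows: List[Row],
    field_name: str,
    value: Optional[Any] = None,
    function: Optional[Converter] = None,
) -> List[Row]:
    """Return new rows with `field_name` converted (hypothetical helper)."""

    def apply(cell: Any) -> Any:
        if function is None:
            return value  # plain replacement with a fixed value
        if callable(function):
            return function(cell)  # arbitrary function
        if isinstance(function, str):
            return getattr(cell, function)()  # method of the cell value
        return function.get(cell, cell)  # dictionary translation (assumed fallback)

    # Build new dictionaries so the input rows stay untouched
    return [{**row, field_name: apply(row[field_name])} for row in rows]


rows = [
    {"id": 1, "name": "germany", "population": 83},
    {"id": 2, "name": "france", "population": 66},
    {"id": 3, "name": "spain", "population": 47},
]

replaced = convert_cells(rows, "population", value=100)
doubled = convert_cells(rows, "population", function=lambda v: v * 2)
upper = convert_cells(rows, "name", function="upper")
mapped = convert_cells(rows, "name", function={"germany": "DE"})
```

Each call starts from the original `rows`, so the results are independent of one another.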
117 changes: 113 additions & 4 deletions frictionless/steps/cell/cell_fill.py
@@ -11,20 +11,129 @@

@attrs.define(kw_only=True)
class cell_fill(Step):
"""Fill cell"""
"""Fill cell

Replaces missing values with non-missing values from the adjacent row/column.

Parameters
----------
type : step identifier
value : value to put into cells that have a missing value.
field_name : name of the field whose missing cells are filled.
direction : direction in which to copy non-missing values ("down", "right" or "left").
When filling across columns, the field types are also checked and a value is
copied only if the two columns have the same type.


Methods
-------
transform_resource(resource):
replaces cell value of a resource using adjacent row/column value or using
user defined value.

Examples
--------
>>> from frictionless import Resource, Pipeline, steps
>>> source = Resource(path="data/transform-missing.csv")
>>> print(source.to_view())
+----+-----------+-----------+-----------------+-----------------+
| id | name | title | population | avg_age |
+====+===========+===========+=================+=================+
| 1 | 'germany' | 'germany' | Decimal('30.5') | None |
+----+-----------+-----------+-----------------+-----------------+
| 2 | None | 'italy' | Decimal('66') | Decimal('35.0') |
+----+-----------+-----------+-----------------+-----------------+
| 3 | 'spain' | 'spain' | Decimal('50.0') | None |
+----+-----------+-----------+-----------------+-----------------+
>>> # replacing missing values by user defined value
>>> pipeline = Pipeline(
steps=[
steps.table_normalize(),
steps.cell_fill(field_name="name", value="france"),
],
)
>>> target = source.transform(pipeline)
>>> print(target.to_view())
+----+-----------+-----------+-----------------+-----------------+
| id | name | title | population | avg_age |
+====+===========+===========+=================+=================+
| 1 | 'germany' | 'germany' | Decimal('30.5') | None |
+----+-----------+-----------+-----------------+-----------------+
| 2 | 'france' | 'italy' | Decimal('66') | Decimal('35.0') |
+----+-----------+-----------+-----------------+-----------------+
| 3 | 'spain' | 'spain' | Decimal('50.0') | None |
+----+-----------+-----------+-----------------+-----------------+

>>> # using non-missing value from the row above
>>> pipeline = Pipeline(
steps=[
steps.table_normalize(),
steps.cell_fill(direction="down"),
],
)
>>> target = source.transform(pipeline)
>>> print(target.to_view())
+----+-----------+-----------+-----------------+-----------------+
| id | name | title | population | avg_age |
+====+===========+===========+=================+=================+
| 1 | 'germany' | 'germany' | Decimal('30.5') | None |
+----+-----------+-----------+-----------------+-----------------+
| 2 | 'france' | 'italy' | Decimal('66') | Decimal('35.0') |
+----+-----------+-----------+-----------------+-----------------+
| 3 | 'spain' | 'spain' | Decimal('50.0') | Decimal('35.0') |
+----+-----------+-----------+-----------------+-----------------+

>>> # using non-missing value from the right column
>>> pipeline = Pipeline(
steps=[
steps.table_normalize(),
steps.cell_fill(direction="left"),
],
)
>>> target = source.transform(pipeline)
>>> print(target.to_view())
+----+-----------+-----------+-----------------+-----------------+
| id | name | title | population | avg_age |
+====+===========+===========+=================+=================+
| 1 | 'germany' | 'germany' | Decimal('30.5') | None |
+----+-----------+-----------+-----------------+-----------------+
| 2 | 'france' | 'italy' | Decimal('66') | Decimal('35.0') |
+----+-----------+-----------+-----------------+-----------------+
| 3 | 'spain' | 'spain' | Decimal('50.0') | Decimal('35.0') |
+----+-----------+-----------+-----------------+-----------------+

>>> # using non-missing value from the left column
>>> pipeline = Pipeline(
steps=[
steps.table_normalize(),
steps.cell_fill(direction="right"),
],
)
>>> target = source.transform(pipeline)
>>> print(target.to_view())
+----+-----------+-----------+-----------------+-----------------+
| id | name | title | population | avg_age |
+====+===========+===========+=================+=================+
| 1 | 'germany' | 'germany' | Decimal('30.5') | Decimal('30.5') |
+----+-----------+-----------+-----------------+-----------------+
| 2 | 'france' | 'italy' | Decimal('66') | Decimal('35.0') |
+----+-----------+-----------+-----------------+-----------------+
| 3 | 'spain' | 'spain' | Decimal('50.0') | Decimal('35.0') |
+----+-----------+-----------+-----------------+-----------------+

"""

type = "cell-fill"

# State

value: Optional[Any] = None
"""TODO: add docs"""
"""Value to put into cells that have a missing value"""

field_name: Optional[str] = None
"""TODO: add docs"""
"""Name of the field whose missing-value cells are replaced"""

direction: Optional[str] = None
"""TODO: add docs"""
"""Direction in which to copy non-missing values ("down", "right" or "left")"""

# Transform

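The fill directions in the `cell_fill` docstring are easy to mix up, so here is a minimal plain-Python sketch of the idea. It is not the frictionless implementation: `fill_cells` is a hypothetical helper, missing values are assumed to be `None`, and unlike the step described in the docstring it does not compare field types when filling across columns.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def fill_cells(
    rows: List[Row],
    value: Optional[Any] = None,
    field_name: Optional[str] = None,
    direction: Optional[str] = None,
) -> List[Row]:
    """Fill None cells (hypothetical helper, not the frictionless code)."""
    result = [dict(row) for row in rows]  # copy so the input stays untouched
    if direction == "down":
        # Carry the last non-missing value of each column down the rows
        last: Row = {}
        for row in result:
            for name, cell in row.items():
                if cell is None and name in last:
                    row[name] = last[name]
                elif cell is not None:
                    last[name] = cell
    elif direction in ("right", "left"):
        # "right" copies from the column on the left, "left" from the right
        for row in result:
            names = list(row)
            if direction == "left":
                names.reverse()
            previous = None
            for name in names:
                if row[name] is None:
                    row[name] = previous
                else:
                    previous = row[name]
    elif field_name is not None:
        # No direction: write a fixed value into the named field
        for row in result:
            if row[field_name] is None:
                row[field_name] = value
    return result


rows = [
    {"id": 1, "name": "germany", "title": "germany", "population": 30.5, "avg_age": None},
    {"id": 2, "name": None, "title": "italy", "population": 66.0, "avg_age": 35.0},
    {"id": 3, "name": "spain", "title": "spain", "population": 50.0, "avg_age": None},
]

fixed = fill_cells(rows, value="france", field_name="name")
down = fill_cells(rows, direction="down")
right = fill_cells(rows, direction="right")
left = fill_cells(rows, direction="left")
```

Because this sketch ignores field types, filling to the right also copies the numeric `id` into a missing `name`; the type check mentioned in the docstring is what would prevent that.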
71 changes: 68 additions & 3 deletions frictionless/steps/cell/cell_format.py
@@ -11,17 +11,82 @@

@attrs.define(kw_only=True)
class cell_format(Step):
"""Format cell"""
"""Format cell

Formats all values in the given or all string fields using the `template` format string.

Parameters
----------
type : step identifier
template : format string to apply to the cells.
field_name : name of the field to apply the format to.

Methods
-------
transform_resource(resource):
formats cells of the given field, or of all string fields, using a user-defined template.

Examples
--------
>>> from frictionless import Resource, Pipeline, steps
>>> source = Resource(path="data/transform-string.csv")
>>> print(source.to_view())
+-----------+--------------+--------------+
| name | country_code | city |
+===========+==============+==============+
| 'germany' | 'DE' | 'berlin' |
+-----------+--------------+--------------+
| 'denmark' | 'DK' | 'copenhagen' |
+-----------+--------------+--------------+
| 'spain' | 'ES' | 'Andalusia' |
+-----------+--------------+--------------+
>>> # apply format to a specific field
>>> pipeline = Pipeline(
steps=[
steps.cell_format(template="Prefix: {0}", field_name="name"),
],
)
>>> target = source.transform(pipeline)
>>> print(target.to_view())
+-------------------+--------------+--------------+
| name | country_code | city |
+===================+==============+==============+
| 'Prefix: germany' | 'DE' | 'berlin' |
+-------------------+--------------+--------------+
| 'Prefix: denmark' | 'DK' | 'copenhagen' |
+-------------------+--------------+--------------+
| 'Prefix: spain' | 'ES' | 'Andalusia' |
+-------------------+--------------+--------------+

>>> # apply format to all fields
>>> pipeline = Pipeline(
steps=[
steps.cell_format(template="Prefix: {0}"),
],
)
>>> target = source.transform(pipeline)
>>> print(target.to_view())
+-------------------+--------------+----------------------+
| name | country_code | city |
+===================+==============+======================+
| 'Prefix: germany' | 'Prefix: DE' | 'Prefix: berlin' |
+-------------------+--------------+----------------------+
| 'Prefix: denmark' | 'Prefix: DK' | 'Prefix: copenhagen' |
+-------------------+--------------+----------------------+
| 'Prefix: spain' | 'Prefix: ES' | 'Prefix: Andalusia' |
+-------------------+--------------+----------------------+

"""

type = "cell-format"

# State

template: str
"""TODO: add docs"""
"""Format string to apply to cells"""

field_name: Optional[str] = None
"""TODO: add docs"""
"""Name of the field to apply the template format to"""

# Transform

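The `cell_format` docstring describes applying a `template` to one field or to all string fields. A minimal plain-Python sketch of that behaviour follows. It is not the frictionless implementation: `format_cells` is a hypothetical helper, and it assumes the template is applied with `str.format` and that non-string cells are left as they are.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def format_cells(
    rows: List[Row],
    template: str,
    field_name: Optional[str] = None,
) -> List[Row]:
    """Apply `template` to string cells (hypothetical helper)."""
    result = []
    for row in rows:
        new_row = dict(row)  # copy so the input stays untouched
        for name, cell in row.items():
            # Restrict to one field when a field name is given
            if field_name is not None and name != field_name:
                continue
            # Only string cells are formatted in this sketch
            if isinstance(cell, str):
                new_row[name] = template.format(cell)
        result.append(new_row)
    return result


rows = [
    {"name": "germany", "country_code": "DE", "city": "berlin"},
    {"name": "denmark", "country_code": "DK", "city": "copenhagen"},
    {"name": "spain", "country_code": "ES", "city": "Andalusia"},
]

one_field = format_cells(rows, "Prefix: {0}", field_name="name")
all_fields = format_cells(rows, "Prefix: {0}")
```

With `field_name="name"` only the first column changes; without it every string column receives the prefix, which matches the two tables shown in the docstring.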