forked from frictionlessdata/tableschema-r
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME.Rmd
822 lines (578 loc) · 30.2 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
---
title: <img src="okgr.png" align="right" width=100px /><img src="oklabs.png" align="right" width=100px /><br><br/><br/><img src="Ffrictionless.png" align="left" width=120 /><br/>rictionless Data - <br/>Table Schema
output:
github_document:
html_preview: no
number_sections: yes
---
[![CRAN_Status_Badge](https://www.r-pkg.org/badges/version/tableschema.r)](https://cran.r-project.org/package=tableschema.r)
[![R-CMD-check](https://github.com/frictionlessdata/tableschema-r/workflows/R-CMD-check/badge.svg)](https://github.com/frictionlessdata/tableschema-r/actions)
[![Coverage status](https://coveralls.io/repos/github/frictionlessdata/tableschema-r/badge.svg)](https://coveralls.io/r/frictionlessdata/tableschema-r?branch=master)
[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](http://www.repostatus.org/badges/latest/active.svg)](http://www.repostatus.org/#active)
[![](http://cranlogs.r-pkg.org/badges/grand-total/tableschema.r)](http://cran.rstudio.com/web/packages/tableschema.r/index.html)
[![Licence](https://img.shields.io/badge/licence-MIT-blue.svg)](https://opensource.org/licenses/MIT)
[![Support](https://img.shields.io/badge/support-discord-brightgreen)](https://discordapp.com/invite/Sewv6av)
# Description
R library for working with [Table Schema][tableschema].
# Features
+ `Table` class for working with data and schema
+ `Schema` class for working with schemas
+ `Field` class for working with schema fields
+ `validate` function for validating schema descriptors
+ `infer` function that creates a schema based on a data sample
# Getting started
## Installation
In order to install the latest distribution of [R software][Rs] to your computer you have to select one of the mirror sites of the [Comprehensive R Archive Network][R], select the appropriate link for your operating system and follow the wizard instructions.
For windows users you can:
1. Go to CRAN
2. Click download R for Windows
3. Click Base (This is what you want to install R for the first time)
4. Download the latest R version
5. Run installation file and follow the instrustions of the installer.
(Mac) OS X and Linux users may need to follow different steps depending on their system version to install R successfully and it is recommended to read the instructions on CRAN site carefully.
Even more detailed installation instructions can be found in [R Installation and Administration manual][Rman].
To install [RStudio][Rstudio], you can download [RStudio Desktop][Rstudiodown] with Open Source License and follow the wizard instructions:
1. Go to [RStudio](https://www.rstudio.com/products/rstudio/)
2. Click download on RStudio Desktop
3. Download on RStudio Desktop free download
4. Select the appropriate file for your system
5. Run installation file
To install the `tableschema` library it is necessary to install first `devtools` library to make installation of github libraries available.
```{r, eval=FALSE, include=T}
# Install devtools package if not already
install.packages("devtools")
```
Install `tableschema.r`
```{r, eval=FALSE, include=T}
# from CRAN version
install.packages("tableschema.r")
# or install the development version from github
devtools::install_github("frictionlessdata/tableschema-r")
```
## Load library
```{r echo=TRUE, message=FALSE, warning=FALSE}
# Install devtools package if not already
# install.packages("jsonlite")
library(jsonlite)
# Install devtools package if not already
# install.packages("future")
library(future)
# load the library using
library(tableschema.r)
```
# Documentation
[Jsonlite package](https://CRAN.R-project.org/package=jsonlite) is internally used to convert json data to list objects. The input parameters of functions could be json strings, files or lists and the outputs are in list format to easily further process your data in R environment and exported as desired. The examples below show how to use jsonlite package to convert the output back to json adding indentation whitespace. More details about handling json you can see jsonlite documentation or vignettes [here](https://CRAN.R-project.org/package=jsonlite).
Moreover [future package](https://CRAN.R-project.org/package=future) is also used to load and create Table and Schema classes asynchronously. To retrieve the actual result of the loaded Table or Schema you have to use `value(...)` to the variable you stored the loaded Table/Schema. More details about future package and sequential and parallel processing you can find [here](https://CRAN.R-project.org/package=future).
## Working with Table
A table is a core concept in a tabular data world. It represents a data with a metadata (Table Schema). Let's see how we could use it in practice.
Consider we have some local csv file. It could be inline data or remote link - all supported by `Table` class (except local files for in-brower usage of course). But say it's `data.csv` for now:
> data/cities.csv
```csv
city,location
london,"51.50,-0.11"
paris,"48.85,2.30"
rome,N/A
```
Let's create and read a table. We use static `Table.load` method and `table.read` method with a `keyed` option to get list of keyed rows:
```{r eval=TRUE, message=FALSE, warning=FALSE, include=TRUE, paged.print=TRUE}
def = Table.load('inst/extdata/data.csv')
table = value(def)
# add indentation whitespace to JSON output with jsonlite package
toJSON(table$read(keyed = TRUE), pretty = TRUE) # function from jsonlite package
```
```{r eval=TRUE, message=FALSE, warning=FALSE, include=TRUE, paged.print=TRUE}
table.headers = table$headers
table.headers
```
As we could see our locations are just a strings. But it should be geopoints. Also Rome's location is not available but it's also just a `N/A` string instead of `null`. First we have to infer Table Schema:
```{r eval=TRUE, message=FALSE, warning=FALSE, include=TRUE}
# add indentation whitespace to JSON output with jsonlite package
toJSON(table$infer(), pretty = TRUE) # function from jsonlite package
```
```{r, eval=TRUE, include=TRUE}
toJSON(table$schema$descriptor, pretty = TRUE) # function from jsonlite package
```
```{r eval=FALSE, message=TRUE,error=TRUE, warning=FALSE, include=TRUE}
table$read(keyed = TRUE) # Fails
```
Let's fix not available location. There is a `missingValues` property in Table Schema specification. As a first try we set `missingValues` to `N/A` in `table$schema$descriptor`. Schema descriptor could be changed in-place but all changes should be commited by `table$schema$commit()`:
```{r, eval = TRUE, include = TRUE, warning=FALSE}
table$schema$descriptor['missingValues'] = 'N/A'
table$schema$commit()
```
```{r, eval = TRUE, include = TRUE, warning=FALSE}
table$schema$valid # false
```
```{r, eval = TRUE, include = TRUE, warning=FALSE}
table$schema$errors
```
As a good citiziens we've decided to check out schema descriptor validity. And it's not valid! We sould use an list for `missingValues` property. Also don't forget to have an empty string as a missing value:
```{r, eval=TRUE, include=TRUE}
table$schema$descriptor[['missingValues']] = list("", 'N/A')
table$schema$commit()
table$schema$valid # true
```
All good. It looks like we're ready to read our data again:
```{r, eval=FALSE, include=TRUE}
table$read() # or
```
```{r, eval=FALSE, include=TRUE}
toJSON(table$read(), pretty = TRUE) # function from jsonlite package
```
Now we see that:
+ locations are lists with numeric lattide and longitude
+ Rome's location is `null`
And because there are no errors on data reading we could be sure that our data is valid againt our schema. Let's save it:
```{r, eval=FALSE, include=TRUE}
table$schema$save('schema.json')
table$save('data.csv')
```
Our `data.csv` looks the same because it has been stringified back to `csv` format. But now we have `schema.json`:
```json
{
"fields": [
{
"name": "city",
"type": "string",
"format": "default"
},
{
"name": "location",
"type": "geopoint",
"format": "default"
}
],
"missingValues": [
"",
"N/A"
]
}
```
If we decide to improve it even more we could update the schema file and then open it again. But now providing a schema path.
```{r, eval=TRUE, include=TRUE}
def = Table.load('inst/extdata/data.csv', schema = 'inst/extdata/schema.json')
table = value(def)
table
```
It was only basic introduction to the `Table` class. To learn more let's take a look on `Table` class API reference.
## Working with Schema
A model of a schema with helpful methods for working with the schema and supported data. Schema instances can be initialized with a schema source as a url to a JSON file or a JSON object. The schema is initially validated (see [validate](#validate) below). By default validation errors will be stored in `schema$errors` but in a strict mode it will be instantly raised.
Let's create a blank schema. It's not valid because `descriptor$fields` property is required by the [Table Schema](http://specs.frictionlessdata.io/table-schema/) specification:
```{r, eval=TRUE, include=TRUE}
def = Schema.load({})
schema = value(def)
schema$valid # false
schema$errors
```
To do not create a schema descriptor by hands we will use a `schema$infer` method to infer the descriptor from given data:
```{r, eval=TRUE, include=TRUE}
toJSON(
schema$infer(helpers.from.json.to.list('[
["id", "age", "name"],
["1","39","Paul"],
["2","23","Jimmy"],
["3","36","Jane"],
["4","28","Judy"]
]')), pretty = TRUE) # function from jsonlite package
```
```{r, eval=TRUE, include=TRUE}
schema$valid # true
```
```{r, eval=TRUE, include=TRUE}
toJSON(
schema$descriptor,
pretty = TRUE) # function from jsonlite package
```
Now we have an inferred schema and it's valid. We could cast data row against our schema. We provide a string input by an output will be cast correspondingly:
```{r, eval=TRUE, include=TRUE}
toJSON(
schema$castRow(helpers.from.json.to.list('["5", "66", "Sam"]')),
pretty = TRUE, auto_unbox = TRUE) # function from jsonlite package
```
But if we try provide some missing value to `age` field cast will fail because for now only one possible missing value is an empty string. Let's update our schema:
```{r, eval=TRUE,error= TRUE, include=TRUE}
schema$castRow(helpers.from.json.to.list('["6", "N/A", "Walt"]'))
# Cast error
```
```{r, eval=TRUE, include=TRUE}
schema$descriptor$missingValues = list('', 'NA')
schema$commit()
```
```{r, eval=TRUE, include=TRUE}
schema$castRow(helpers.from.json.to.list('["6", "", "Walt"]'))
```
We could save the schema to a local file. And we could continue the work in any time just loading it from the local file:
```{r, eval=FALSE, include=TRUE}
schema$save('schema.json')
schema = Schema.load('schema.json')
```
It was only basic introduction to the `Schema` class. To learn more let's take a look on `Schema` class API reference.
## Working with Field
Class represents field in the schema.
Data values can be cast to native R types. Casting a value will check the value is of the expected type, is in the correct format, and complies with any constraints imposed by a schema.
```json
{
"name": "birthday",
"type": "date",
"format": "default",
"constraints": {
"required": true,
"minimum": "2015-05-30"
}
}
```
Following code will not raise the exception, despite the fact our date is less than minimum constraints in the field, because we do not check constraints of the field descriptor
```{r, eval=TRUE, include=TRUE}
field = Field$new(helpers.from.json.to.list('{"name": "name", "type": "number"}'))
dateType = field$cast_value('12345') # cast
dateType # print the result
```
And following example will raise exception, because we set flag 'skip constraints' to `false`, and our date is less than allowed by `minimum` constraints of the field. Exception will be raised as well in situation of trying to cast non-date format values, or empty values
```{r, eval=TRUE, error=TRUE, include=TRUE}
tryCatch(
dateType = field$cast_value(value = '2014-05-29', constraints = FALSE),
error = function(e){# uh oh, something went wrong
})
```
Values that can't be cast will raise an `Error` exception. Casting a value that doesn't meet the constraints will raise an `Error` exception.
Table below shows the available types, formats and resultant value of the cast:
+-----------+-----------------------------+------------------+
| Type | Formats | Casting result |
+:==========+:============================+:=================+
| any | default | Any |
+-----------+-----------------------------+------------------+
| list | default | List |
+-----------+-----------------------------+------------------+
| boolean | default | Boolean |
+-----------+-----------------------------+------------------+
| date | default, any | Date |
+-----------+-----------------------------+------------------+
| datetime | default, any | Date |
+-----------+-----------------------------+------------------+
| duration | default | Duration |
+-----------+-----------------------------+------------------+
| geojson | default, topojson | Object |
+-----------+-----------------------------+------------------+
| geopoint | default, list, object | [Number, Number] |
+-----------+-----------------------------+------------------+
| integer | default | Number |
+-----------+-----------------------------+------------------+
| number | default | Number |
+-----------+-----------------------------+------------------+
| object | default | Object |
+-----------+-----------------------------+------------------+
| string | default, uri, email, binary | String |
+-----------+-----------------------------+------------------+
| time | default, any | Date |
+-----------+-----------------------------+------------------+
| year | default | Number |
+-----------+-----------------------------+------------------+
| yearmonth | default | [Number, Number] |
+-----------+-----------------------------+------------------+
### Working with Validate
> `validate()` validates whether a **schema** is a validate Table Schema accordingly to the [specifications](https://frictionlessdata.io/schemas/table-schema.json). It does **not** validate data against a schema.
Given a schema descriptor `validate` returns a validation object:
```{r, eval=TRUE, include=TRUE}
valid_errors = validate('inst/extdata/schema.json')
valid_errors
```
### Working with Infer
Given data source and headers `infer` will return a Table Schema as a JSON object based on the data values.
Given the data file, example.csv:
```csv
id,age,name
1,39,Paul
2,23,Jimmy
3,36,Jane
4,28,Judy
```
Call `infer` with headers and values from the datafile:
```{r, eval=TRUE, include=TRUE}
descriptor = infer('inst/extdata/data_infer.csv')
```
The `descriptor` variable is now a list object that can easily converted to JSON:
```{r, eval=TRUE, include=TRUE}
toJSON(
descriptor,
pretty = TRUE
) # function from jsonlite package
```
## API Reference
### Table
Table representation
* [Table](#Table)
* _instance_
* [$headers](#Table+headers) ⇒ <code>List.<string></code>
* [$schema](#Table+schema) ⇒ <code>Schema</code>
* [$iter(keyed, extended, cast, forceCast, relations, stream)](#Table+iter) ⇒ <code>AsyncIterator</code> \| <code>Stream</code>
* [$read(limit)](#Table+read) ⇒ <code>List.<List></code> \| <code>List.<Object></code>
* [$infer(limit)](#Table+infer) ⇒ <code>Object</code>
* [$save(target)](#Table+save) ⇒ <code>Boolean</code>
* _static_
* [.load(source, schema, strict, headers, parserOptions)](#Table.load) ⇒ [<code>Table</code>](#Table)
#### table$headers ⇒ <code>List.<string></code>
Headers
**Returns**: <code>List.<string></code> - data source headers
#### table$schema ⇒ <code>Schema</code>
Schema
**Returns**: <code>Schema</code> - table schema instance
#### table$iter(keyed, extended, cast, forceCast, relations, stream) ⇒ <code>AsyncIterator</code> \| <code>Stream</code>
Iterate through the table data
And emits rows cast based on table schema (async for loop).
With a `stream` flag instead of async iterator a Node stream will be returned.
Data casting can be disabled.
**Returns**: <code>AsyncIterator</code> \| <code>Stream</code> - async iterator/stream of rows:
- `[value1, value2]` - base
- `{header1: value1, header2: value2}` - keyed
- `[rowNumber, [header1, header2], [value1, value2]]` - extended
**Throws**:
- <code>TableSchemaError</code> raises any error occurred in this process
| Param | Type | Description |
| --- | --- | --- |
| keyed | <code>boolean</code> | iter keyed rows |
| extended | <code>boolean</code> | iter extended rows |
| cast | <code>boolean</code> | disable data casting if false |
| forceCast | <code>boolean</code> | instead of raising on the first row with cast error return an error object to replace failed row. It will allow to iterate over the whole data file even if it's not compliant to the schema. Example of output stream: `[['val1', 'val2'], TableSchemaError, ['val3', 'val4'], ...]` |
| relations | <code>Object</code> | object of foreign key references in a form of `{resource1: [{field1: value1, field2: value2}, ...], ...}`. If provided foreign key fields will checked and resolved to its references |
| stream | <code>boolean</code> | return Node Readable Stream of table rows |
#### table$read(limit) ⇒ <code>List.<List></code> \| <code>List.<Object></code>
Read the table data into memory
> The API is the same as `table.iter` has except for:
**Returns**: <code>List.<List></code> \| <code>List.<Object></code> - list of rows:
- `[value1, value2]` - base
- `{header1: value1, header2: value2}` - keyed
- `[rowNumber, [header1, header2], [value1, value2]]` - extended
| Param | Type | Description |
| --- | --- | --- |
| limit | <code>integer</code> | limit of rows to read |
#### table$infer(limit) ⇒ <code>Object</code>
Infer a schema for the table.
It will infer and set Table Schema to `table.schema` based on table data.
**Returns**: <code>Object</code> - Table Schema descriptor
| Param | Type | Description |
| --- | --- | --- |
| limit | <code>number</code> | limit rows sample size |
#### table$save(target) ⇒ <code>Boolean</code>
Save data source to file locally in CSV format with `,` (comma) delimiter
**Returns**: <code>Boolean</code> - true on success
**Throws**:
- <code>TableSchemaError</code> an error if there is saving problem
| Param | Type | Description |
| --- | --- | --- |
| target | <code>string</code> | path where to save a table data |
#### Table.load(source, schema, strict, headers, parserOptions) ⇒ [<code>Table</code>](#Table)
Factory method to instantiate `Table` class.
This method is async and it should be used with await keyword or as a `Promise`.
If `references` argument is provided foreign keys will be checked
on any reading operation.
**Returns**: [<code>Table</code>](#Table) - data table class instance
**Throws**:
- <code>TableSchemaError</code> raises any error occurred in table creation process
| Param | Type | Description |
| --- | --- | --- |
| source | <code>string</code> \| <code>List.<List></code> \| <code>Stream</code> \| <code>function</code> | data source (one of): - local CSV file (path) - remote CSV file (url) - list of lists representing the rows - readable stream with CSV file contents - function returning readable stream with CSV file contents |
| schema | <code>string</code> \| <code>Object</code> | data schema in all forms supported by `Schema` class |
| strict | <code>boolean</code> | strictness option to pass to `Schema` constructor |
| headers | <code>number</code> \| <code>List.<string></code> | data source headers (one of): - row number containing headers (`source` should contain headers rows) - list of headers (`source` should NOT contain headers rows) |
| parserOptions | <code>Object</code> | options to be used by CSV parser. All options listed at <https://csv.js.org/parse/options/>. By default `ltrim` is true according to the CSV Dialect spec. |
### Schema
Schema representation
* [Schema](#Schema)
* _instance_
* [$valid](#Schema+valid) ⇒ <code>Boolean</code>
* [$errors](#Schema+errors) ⇒ <code>List.<Error></code>
* [$descriptor](#Schema+descriptor) ⇒ <code>Object</code>
* [$primaryKey](#Schema+primaryKey) ⇒ <code>List.<string></code>
* [$foreignKeys](#Schema+foreignKeys) ⇒ <code>List.<Object></code>
* [$fields](#Schema+fields) ⇒ <code>List.<Field></code>
* [$fieldNames](#Schema+fieldNames) ⇒ <code>List.<string></code>
* [$getField(fieldName)](#Schema+getField) ⇒ <code>Field</code> \| <code>null</code>
* [$addField(descriptor)](#Schema+addField) ⇒ <code>Field</code>
* [$removeField(name)](#Schema+removeField) ⇒ <code>Field</code> \| <code>null</code>
* [$castRow(row, failFalst)](#Schema+castRow) ⇒ <code>List.<List></code>
* [$infer(rows, headers)](#Schema+infer) ⇒ <code>Object</code>
* [$commit(strict)](#Schema+commit) ⇒ <code>Boolean</code>
* [$save(target)](#Schema+save) ⇒ <code>boolean</code>
* _static_
* [.load(descriptor, strict)](#Schema.load) ⇒ [<code>Schema</code>](#Schema)
#### schema$valid ⇒ <code>Boolean</code>
Validation status
It always `true` in strict mode.
**Returns**: <code>Boolean</code> - returns validation status
#### schema$errors ⇒ <code>List.<Error></code>
Validation errors
It always empty in strict mode.
**Returns**: <code>List.<Error></code> - returns validation errors
#### schema$descriptor ⇒ <code>Object</code>
Descriptor
**Returns**: <code>Object</code> - schema descriptor
#### schema$primaryKey ⇒ <code>List.<string></code>
Primary Key
**Returns**: <code>List.<string></code> - schema primary key
#### schema$foreignKeys ⇒ <code>List.<Object></code>
Foreign Keys
**Returns**: <code>List.<Object></code> - schema foreign keys
#### schema$fields ⇒ <code>List.<Field></code>
Fields
**Returns**: <code>List.<Field></code> - schema fields
#### schema$fieldNames ⇒ <code>List.<string></code>
Field names
**Returns**: <code>List.<string></code> - schema field names
#### schema$getField(fieldName) ⇒ <code>Field</code> \| <code>null</code>
Return a field
**Returns**: <code>Field</code> \| <code>null</code> - field instance if exists
| Param | Type |
| --- | --- |
| fieldName | <code>string</code> |
#### schema$addField(descriptor) ⇒ <code>Field</code>
Add a field
**Returns**: <code>Field</code> - added field instance
| Param | Type |
| --- | --- |
| descriptor | <code>Object</code> |
#### schema$removeField(name) ⇒ <code>Field</code> \| <code>null</code>
Remove a field
**Returns**: <code>Field</code> \| <code>null</code> - removed field instance if exists
| Param | Type |
| --- | --- |
| name | <code>string</code> |
#### schema$castRow(row, failFalst) ⇒ <code>List.<List></code>
Cast row based on field types and formats.
**Returns**: <code>List.<List></code> - cast data row
| Param | Type | Description |
| --- | --- | --- |
| row | <code>List.<List></code> | data row as an list of values |
| failFalst | <code>boolean</code> | |
#### schema$infer(rows, headers) ⇒ <code>Object</code>
Infer and set `schema.descriptor` based on data sample.
**Returns**: <code>Object</code> - Table Schema descriptor
| Param | Type | Description |
| --- | --- | --- |
| rows | <code>List.<List></code> | list of lists representing rows |
| headers | <code>integer</code> \| <code>List.<string></code> | data sample headers (one of): - row number containing headers (`rows` should contain headers rows) - list of headers (`rows` should NOT contain headers rows) - defaults to 1 |
#### schema$commit(strict) ⇒ <code>Boolean</code>
Update schema instance if there are in-place changes in the descriptor.
**Returns**: <code>Boolean</code> - returns true on success and false if not modified
**Throws**:
- <code>TableSchemaError</code> raises any error occurred in the process
| Param | Type | Description |
| --- | --- | --- |
| strict | <code>boolean</code> | alter `strict` mode for further work |
**Example**
```{r, eval=TRUE, include=TRUE}
descriptor <- '{"fields": [{"name": "field", "type": "string"}]}'
def <- Schema.load(descriptor)
schema <- value(def)
schema$getField('name')
schema$descriptor$fields[[1]]$type
schema$descriptor$fields[[1]]$type <-'number'
schema$descriptor$fields[[1]]$type
schema$commit()
```
#### schema$save(target) ⇒ <code>boolean</code>
Save schema descriptor to target destination.
**Returns**: <code>boolean</code> - returns true on success
**Throws**:
- <code>TableSchemaError</code> raises any error occurred in the process
| Param | Type | Description |
| --- | --- | --- |
| target | <code>string</code> | path where to save a descriptor |
#### Schema.load(descriptor, strict) ⇒ [<code>Schema</code>](#Schema)
Factory method to instantiate `Schema` class.
This method is async and it should be used with await keyword or as a `Promise`.
**Returns**: [<code>Schema</code>](#Schema) - returns schema class instance
**Throws**:
- <code>TableSchemaError</code> raises any error occurred in the process
| Param | Type | Description |
| --- | --- | --- |
| descriptor | <code>string</code> \| <code>Object</code> | schema descriptor: - local path - remote url - object |
| strict | <code>boolean</code> | flag to alter validation behaviour: - if false error will not be raised and all error will be collected in `schema.errors` - if strict is true any validation error will be raised immediately |
### Field
Field representation
* [Field](#Field)
* [new Field(descriptor, missingValues)](#new_Field_new)
* [$name](#Field+name) ⇒ <code>string</code>
* [$type](#Field+type) ⇒ <code>string</code>
* [$format](#Field+format) ⇒ <code>string</code>
* [$required](#Field+required) ⇒ <code>boolean</code>
* [$constraints](#Field+constraints) ⇒ <code>Object</code>
* [$descriptor](#Field+descriptor) ⇒ <code>Object</code>
* [$castValue(value, constraints)](#Field+castValue) ⇒ <code>any</code>
* [$testValue(value, constraints)](#Field+testValue) ⇒ <code>boolean</code>
#### new Field(descriptor, missingValues)
Constructor to instantiate `Field` class.
**Returns**: [<code>Field</code>](#Field) - returns field class instance
**Throws**:
- <code>TableSchemaError</code> raises any error occured in the process
| Param | Type | Description |
| --- | --- | --- |
| descriptor | <code>Object</code> | schema field descriptor |
| missingValues | <code>List.<string></code> | an list with string representing missing values |
#### field$name ⇒ <code>string</code>
Field name
#### field$type ⇒ <code>string</code>
Field type
#### field$format ⇒ <code>string</code>
Field format
#### field$required ⇒ <code>boolean</code>
Return true if field is required
#### field$constraints ⇒ <code>Object</code>
Field constraints
#### field$descriptor ⇒ <code>Object</code>
Field descriptor
#### field$castValue(value, constraints) ⇒ <code>any</code>
Cast value
**Returns**: <code>any</code> - cast value
| Param | Type | Description |
| --- | --- | --- |
| value | <code>any</code> | value to cast |
| constraints | <code>Object</code> \| <code>false</code> | |
#### field$testValue(value, constraints) ⇒ <code>boolean</code>
Check if value can be cast
| Param | Type | Description |
| --- | --- | --- |
| value | <code>any</code> | value to test |
| constraints | <code>Object</code> \| <code>false</code> | |
### validate(descriptor) ⇒ <code>Object</code>
This function is async so it has to be used with `await` keyword or as a `Promise`.
**Returns**: <code>Object</code> - returns `{valid, errors}` object
| Param | Type | Description |
| --- | --- | --- |
| descriptor | <code>string</code> \| <code>Object</code> | schema descriptor (one of): - local path - remote url - object |
### infer(source, headers, options) ⇒ <code>Object</code>
This function is async so it has to be used with `await` keyword or as a `Promise`.
**Returns**: <code>Object</code> - returns schema descriptor
**Throws**:
- <code>TableSchemaError</code> raises any error occured in the process
| Param | Type | Description |
| --- | --- | --- |
| source | <code>string</code> \| <code>List.<List></code> \| <code>Stream</code> \| <code>function</code> | data source (one of): - local CSV file (path) - remote CSV file (url) - list of lists representing the rows - readable stream with CSV file contents - function returning readable stream with CSV file contents |
| headers | <code>List.<string></code> | list of headers |
| options | <code>Object</code> | any `Table.load` options |
# Contributing
The project follows the [Open Knowledge International coding standards][coding_standards]. There are common commands to work with the project.Recommended way to get started is to create, activate and load the library environment. To install package and development dependencies into active environment:
```{r, eval=FALSE, include=TRUE}
devtools::install_github("frictionlessdata/tableschema-r", dependencies = TRUE)
```
To make test:
```{r, eval=FALSE, include=TRUE}
test_that(description, {
expect_equal(test, expected result)
})
```
To run tests:
```{r, eval=FALSE, include=TRUE}
devtools::test()
```
More detailed information about how to create and run tests you can find in [testthat package](https://github.com/hadley/testthat).
## Changelog - News
In [NEWS.md][news] described only breaking and the most important changes. The full changelog could be found in nicely formatted [commit][commits] history.
[Rs]: https://www.r-project.org/
[R]: https://cran.r-project.org/
[Rman]: https://cran.r-project.org/doc/manuals/R-admin.html
[Rstudio]: https://www.rstudio.com/
[Rstudiodown]: https://www.rstudio.com/products/rstudio/download/
[coding_standards]: https://github.com/okfn/coding-standards
[tableschema]: http://specs.frictionlessdata.io/table-schema/
[news]: https://github.com/frictionlessdata/tableschema-r/blob/master/NEWS.md
[commits]: https://github.com/frictionlessdata/tableschema-r/commits/master