NEWS | R Documentation |
memisc News
Version 0.99
NEW FEATURES
- A new object-oriented infrastructure for the creation of HTML code is used in format_html() methods. This infrastructure is exposed by the html() function.
- Support for model groups in mtable(). c.mtable() now creates groups of models if its arguments are tagged.
- Flattened contingency tables ("ftable" objects, as created by the eponymous function in the stats package) can now be combined into "ftable_matrix" objects. This can be done by using rbind() or cbind().
- There is now an object class for survey items containing dates (without times), called "Date.item".
- Support for including sandwich estimates of sampling variances and standard errors in the output of summary() and mtable(), by the new generic functions withVCov() and withSE().
- Support for different parameter sections is added to mtable(). This is intended to allow output of mixed effects models to distinguish between coefficients ("fixed effects") and variance parameters.
- Objects created by mtable() can also have several header lines. Facilities to add additional header lines will be added soon.
- Optionally, mtable() shows the left-hand sides of model equations. This can be controlled by the optional argument show.eqnames and by the global option "mtable.show.eqnames".
- Output of mtable() objects also includes, if applicable, a note that explains the "significance stars" for p-values.
- Summary statistics reported by mtable() can now be selected separately for each object or object class (via calls to options()).
- It is now possible to compress the output concerning control variables in mtable().
- Support for HTML and LaTeX output in Jupyter notebooks is added to objects created by mtable(), ftable(), etc.
- The toLatex() method for "ftable" objects gains a fold.leaders option (with default value FALSE), which allows the row labels (leaders) to remain in a single column.
- A function codeplan() creates a data frame describing the structure of an "importer", "data.set", or "item" object. The structure so described can be copied from one "data.set" object to another or to a data frame.
- New $ and [[ operators for "importer" objects allow creating codebooks for single items/variables in imported data files.
- A duplicated_labels() function shows and describes duplicated labels, and a deduplicate_labels() function removes such duplicates.
- New operators %#%, %##%, and %@% to manipulate annotations and other attributes.
- A List() function adds names to its elements by deparsing arguments in the same way as data.frame() does.
- A new function Groups() splits a data frame or a "data.set" into groups based on factors in a more convenient way. There are methods of with() and within() to deal with the resulting objects of class "grouped.data". For example, the within() method allows subtracting group means from the observations within groups. withinGroups() splits a data frame or "data.set" object into groups, makes within-group computations, and recombines the groups into the order of the original data frame or "data.set" object.
- A new function Reshape() simplifies the syntax to reshape data frames and "data.set" objects from wide into long or from long into wide format.
- 'tibbles', including those created with the haven package, can be translated into "data.set" objects without loss of information. Also, "data.set" objects can be translated into 'tibbles' with minimal loss of information.
- An extendable function view() allows using the View() facilities provided by graphical user interfaces (in particular RStudio) with objects not originally supported by these user interfaces. In addition, view() methods for "codeplan", "descriptions", "data.set", and "importer" objects are provided, which allow conveniently inspecting the contents of these objects in RStudio.
- An "as.data.table" method for coercing "data.set" objects directly into "data.table" objects.
- It is now possible to specify the measurement level for a set of variables in a "data.set" object, either by using the assignment operator with measurement() or by using the new function set_measurement().
- There are convenience wrappers such as Mean() etc. for mean() etc. that have the default setting na.rm=TRUE instead of na.rm=FALSE.
- A new deduplicate_labels() function allows dealing with duplicate labels (where several codes have the same label).
- It is now possible to create codebooks for weighted data.
- The function trim_labels() trims codes from value labels.
- The function reverse() reorders the codes of a survey item in reverse order.
- The generic function Means() conveniently obtains group means, optionally with standard errors and/or confidence intervals.
- The colon operator (:) can be used to refer to ranges of variables in foreach().
- Code plans (objects of class "codeplan") can now be exported to and imported from YAML and JSON files.
- A new generic function format_md() (contributed by Mael Astrud-Le Souder) formats R objects in Markdown. Currently, methods for codebooks (and entries in codebooks) are implemented.
- A new generic function coarsen() coarsens numeric vectors into factors, based on a given number of categories.
- A new generic function measurement_autolevel() automatically selects the appropriate measurement level for survey items.
- A new operator %if% assigns values to a variable for observations that satisfy a condition.
- A new operator %$$% abbreviates object modifications using within(), i.e. instead of a <- within(a, { ... }) you can write a %$$% { ... }.
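The last entry above can be illustrated with the base R form that the %$$% operator is said to abbreviate. This is a minimal sketch that runs without memisc; the data frame and column names are made up for illustration, and the memisc form is shown only as a comment, following the pattern quoted in the entry.

```r
# Base R form: modify a data frame via within() and assign the result back.
a <- data.frame(x = c(1, 2, 3), y = c(10, 20, 30))

a <- within(a, {
  z <- x + y    # add a derived column, using the original y
  y <- y / 10   # then overwrite an existing column
})
print(a)

# With memisc attached, the entry above states that the same modification
# can be abbreviated as (not run here):
# a %$$% {
#   z <- x + y
#   y <- y / 10
# }
```

The abbreviation mainly saves repeating the object name on both sides of the assignment.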
IMPROVEMENTS
- Subset methods for importer objects are much more memory efficient and can now handle files larger than 1GB.
- The useDcolumn and useBooktabs arguments of toLatex() methods now have global options as defaults.
- toLatex() methods optionally escape dollar, subscript, and superscript symbols. This can be set either by an explicit (new) argument toLatex.escape.tex or by a global option with the same name.
- The toLatex() method for "ftable" objects has a new option fold.leaders.
- spss.system.file() now translates numeric variables with any SPSS date format into a "datetime.item".
- The function List() adds names to the elements of the resulting list in a way similar to how data.frame() adds names to the columns of a data frame.
- Stata.file() now handles files in format rev. 117 and later, as they are created by Stata versions later than 13.
- User-defined missing values are now reported in separate tables in entries created by codebook(), even if these entries refer to items with measurement level "interval" or "ratio".
- If the annotation or the labels of a non-item are set to NULL, this no longer causes an error.
- Changing variable names to lowercase while importing data sets with Stata.file(), spss.portable.file(), and spss.system.file() is now optional.
- Importer methods Stata.file(), spss.portable.file(), and spss.system.file() now have optional arguments that allow dealing with variable labels or value labels in a non-native encoding (e.g. CP1252 on a utf-8 platform).
- A function spss.file() acts as a common interface to spss.portable.file() and spss.system.file().
- The functions head() and tail() now work with "data.set" and "importer" objects in the same sensible way as they do with data frames.
- The function recode() behaves more coherently: If a labelled vector is the result of recode(), it gets the measurement level "nominal". Factor levels explicitly created come first in the order of factor levels.
- The function spss.system.file() now handles buggy SPSS system files that lack information about the number of variables in their header. (These files are typically created by the library ReadStat, used e.g. by the R package 'haven'.)
- SPSS syntax files are now converted to the encoding of the host system if they have a different one. By default, the original encoding is assumed to be Codepage 1252 (extended Latin-1).
- codebook(), codeplan(), labels(), value.filter, and related functions return NULL for NULL arguments.
- codeplan() also works with individual survey items and can be set to NULL, which means that all memisc-specific information is removed from the data.
- codebook() also works with data frames (or "tibbles") imported with the haven package.
- codebook() now makes use of the "label" attribute of variables if the attribute is present.
- with(Groups()), withGroups(), within(Groups()), withinGroups(), Aggregate(), and genTable() are considerably faster now. They can also make use of certain automatic variables such as n_ and i_, which contain group sizes and group indices.
- relabel(), rename(), and dimrename() no longer require their arguments to be enclosed in quotation marks.
- Operators '$', '[', and '[[' can now be applied to codebook objects to get a codebook of a subset of the variables.
- spss.system.file() now uses information contained in SPSS files (if available) to determine the measurement level of the imported variables.
- spss.system.file() uses information about the character set encoding, if available in the file, to translate variable labels and value labels into the encoding of the machine on which R is being run.
- spss.system.file() also (optionally) uses information about the intended measurement level of variables in the file.
- as.item() now drops non-unique labelled values when applied to a "labelled", "haven_labelled", or "haven_labelled_spss" object.
- spss.system.file() now takes into account metadata about measurement levels ("nominal", "ordinal", or "scale") to set the measurement() attributes of the items in the resulting "importer" and "data.set" objects.
- mtable() now handles objects of class "clmm" (from package "ordinal"), and the handling of objects of class "merMod" (from package "lme4") is more consistent with those of class "glm" (e.g. the number of observations is shown). Variance component estimates of "merMod" and "clmm" objects are reported as distinct statistics.
- recode() has a new optional argument code=. If TRUE, existing codes (and labels) are retained.
- recode() now allows recoding factors into numeric vectors.
- If the change in codes done by recode() merely reorders codes, labels are reordered accordingly, unless labels are explicitly given.
- subset() is S3-generic again, as this allows for lazy evaluation of its arguments.
- cases() handles NAs more sensibly: if a case condition is TRUE, this leads to a non-NA result even if other conditions evaluate to FALSE, provided cases() is called with na.rm=TRUE (the default).
- The result of subset() and of the bracket operator ([]) applied to importer objects has row names that indicate the rows selected from the full data.
- A method of format() for data set objects is added.
- The row names of subsets of importer objects reflect the row numbers in the original data.
- collect.data.frame and collect.data.set gain a use_last and a detailed_warnings option to improve handling of variables with different attributes in different objects being collected.
- spss.system.file(), spss.portable.file(), and Stata.file() get an optional negative2missing argument.
- recode() keeps NAs as NAs when an otherwise argument is given and NAs are not recoded explicitly.
- codebook() now fully supports logical vectors.
- HTML output created by format_html etc. now uses '<style>' elements for formatting. This reduces the size of the created HTML code.
BUGFIXES
- str and ls.str are imported from the utils package to prevent a NOTE in R CMD check.
- HTML tables and lists are no longer wrapped in HTML paragraphs in format_html.CodebookEntry.
- show and codebookEntry methods for the "datetime.item" now work as expected.
- cases handles NAs more gracefully.
- toLatex.ftable output has been improved: No attempt at showing non-existent variable names, better application of extracolsep.
- Duplicate value labels now produce an error if an item object is coerced into a factor.
- A bug concerning missing values in SPSS files is fixed.
- Headlines in vignettes are now coherent.
- mtable with empty summary sections can be created (again).
- mtable now returns objects with class "memisc_mtable" to avoid a name clash with objects created by model.tables in package "stats".
- Calls to PROTECT are added to the C source to prevent protection errors.
- toLatex() now handles matrices in data frames.
- spss.portable.file() now handles files with weighting variables and empty variable labels.
- spss.fixed.file() now handles files with lines that are longer than the number of columns specified in the columns definition file.
- spss.system.file() now correctly imports value labels of string variables.
- Some PROTECTION issues in the C source flagged by Tomas Kalibera's rchk utility are fixed.
- If "data.set" objects are combined and succeeding objects contain "items" not contained in the preceding ones, the result will now still be a valid "data.set" object.
- seekData etc. no longer try to recreate external pointers in order to avoid segmentation faults. Also the deletion of empty pointers is avoided for the same reason.
- as.data.set works for "tibbles" also when method dispatch via class inheritance does not work.
- codebook() now handles character variables in SPSS system files correctly.
- codebook() uses the appropriate logical operator in checking for missings.
USER-VISIBLE CHANGES
- All vignettes are now using knitr.
- HTML output uses unicode characters by default instead of ampersand-escapes to enhance compatibility with pandoc.
- codebook() no longer shows the skewness and kurtosis of numeric variables to save output space.
DEFUNCT
- The function UnZip has been removed from the package. unzip in conjunction with system.file does the same job, as can be seen in the example for spss.portable.file.
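The replacement described above might look roughly like the following sketch. The archive path "anes/NES1948.ZIP" and the member name "NES1948.POR" are assumptions recalled from the package's example data, not taken from this NEWS file; the code falls back to a message if they cannot be found.

```r
# Locate a zip archive shipped inside an installed package.
# ASSUMPTION: the archive and member names below are illustrative;
# system.file() returns "" if the package or the file is not found.
zipfile <- system.file("anes/NES1948.ZIP", package = "memisc")

if (nzchar(zipfile)) {
  # Extract a single member into a fresh temporary directory.
  # unzip() creates exdir if necessary and returns the extracted path(s).
  por_path <- unzip(zipfile, files = "NES1948.POR", exdir = tempfile())
  print(por_path)
} else {
  message("memisc (or the assumed archive) not found; nothing extracted")
}
```

The extracted path could then be passed to an importer such as spss.portable.file.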
Version 0.98
NEW FEATURES
- Exporting results of various functions into HTML format is now supported by the function format_html. This should make it easier to import them into HTML or word-processing documents (that support importing HTML). A preview of the HTML is made available by the new (generic) function show_html.
- In particular, results of the functions mtable (i.e. tables of model estimates), ftable (i.e. flattened contingency tables etc.), and codebook can be exported into HTML using format_html. Also data frames can be exported into HTML.
- A function dsView is added, which allows a display of data.set objects similar to how View displays data frames.
- mtable now handles multi-equation models better, in particular if the model objects supplied as arguments vary in the number and/or names of the equations. There is also a new option to place confidence intervals to the right of coefficient estimates. Further, mtable gains the following optional arguments:
  - show.baselevel, which allows suppressing the display of baseline categories of dummy variables when dummy variable coefficients are displayed.
  - sdigits, to specify the number of digits of summary statistics.
  - gs.options, to pass optional arguments to getSummary, allowing for more flexibility in creating tables.
  One can now use a summaryTemplate generic function for formatting model summaries, in addition to setting the template by setSummaryTemplate. Finally, parts of "mtables" can be extracted using the [ operator as with matrices, and "mtables" can now also be concatenated.
- There is now an object class for survey items containing dates and times, called "datetime.item".
- There is a new function wild.codes to check wild codes (i.e. unlabelled codes of an otherwise labelled item).
- codebook now supports data frames, factors, and numeric vectors.
- A toLatex method now exists for data.set objects, data frames, and other objects.
- A new percentages function is added to allow easy creation of tables of percentages.
BUGFIXES
- spss.fixed.file is now able to handle labelled strings and value labels and missing values statements.
- Internal C code used by spss.fixed.file no longer assumes that arguments are copied; some strange behaviour of objects created by spss.fixed.file is now corrected.
- Description of items in external data sources is more complete now: it gives the same information as for items in internal data.sets.
- applyTemplate now returns empty strings for undefined quantities.
- The collect method for data.sets now works as expected.
- spss.fixed.file now checks whether there are undefined variables in varlab.file etc.
- Stata.file can now import Stata 9 and Stata 10 files.
USER-VISIBLE CHANGES
- Argument drop is no longer used by function mtable.
- The format of the file produced by write.mtable can now be specified using a format= argument. But forLaTeX=TRUE can still be used to get LaTeX files.
DEFUNCT
- The functions Termplot, Simulate, and panel.errbars are defunct. Graphics similar to those built with panel.errbars can be created with facilities provided by the package "mplot", which is currently available on GitHub.
Version 0.97
NEW FEATURES
- spss.system.file and spss.portable.file gain a tolower= argument that defaults to TRUE, which allows changing annoying all-upper-case variable names to lower case.
- New generic function Iconv() that allows changing the character encoding of variable descriptions and value labels. It has methods for "data.set", "importer", "item", "annotation", and "value.label" objects.
- There is now a method of as.character() for "codebook" objects and a convenience function Write() with methods for "codebook" and "description" to make it more convenient to direct the output of codebook() and description() into text files.
- A method for "merMod" objects of the getSummary() generic function. mtable() should now be able (again) to handle estimation results produced by lmer() and glmer() from package 'lme4'.
- recode() handles character vectors in a more convenient way: They are converted into factors with sorted unique values (after recoding) as levels.
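What "sorted unique values as levels" means in the last entry can be shown with a base R analogue. This is only a sketch of the level ordering using factor(), not a call to recode() itself; the example vector is made up.

```r
# A character vector with repeated, unsorted values.
x <- c("beta", "alpha", "gamma", "alpha")

# factor() uses the sorted unique values as levels by default,
# which is the ordering the entry above describes for recoded
# character vectors.
f <- factor(x)
print(levels(f))

# The levels coincide with sort(unique(x)).
print(identical(levels(f), sort(unique(x))))
```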
USER-VISIBLE CHANGES
- getSummary.expCoef is renamed into getSummary_expCoef.
DEFUNCT
- S3 method aggregate.formula has been removed from the package to avoid a clash with the method of the same name in the base package. The function Aggregate can be used instead.
- Removed include, uninclude, and detach.sources as these are flagged as modifying the global namespace.