Create a term-document matrix
This example builds the term-document matrix (TDM) for all the documents whose paths are stored in a variable of the input data set.
To run this example, paste the statements below into the Command area and press the EXECUTE button. To view the resulting data sets, go to the PATH tab and open the ones named Term_document_matrix and Information.
The first data set is the term-document matrix (TDM); the second records which file each column of the TDM corresponds to.
More options for running the procedure are available through the ADaMSoft GUI: in the List of procedure tab, select Text mining (create the term document matrix and similar), follow the link Create a term document matrix for all the files..., and then click the Execute button.
Proc Multiwordcounter dict=Dircontent out=term_document_matrix outdoclist=information;
varpathfiles name;
varaddinfodoc filenamenoext;
casesensitive;
onlyascii;
nonumbers;
run;
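To make the shape of the two output data sets more concrete, the following is a minimal Python sketch (not ADaMSoft code, and not the procedure's actual implementation). It builds a small matrix with one row per term and one column per document, plus a second table linking each column to its document. The function name, the column labels doc_1, doc_2, ..., the word-splitting rule, and the meaning given to the three flags (keep upper and lower case distinct, drop non-ASCII terms, drop purely numeric terms) are all assumptions made for illustration.

```python
import re
from collections import Counter


def build_term_document_matrix(docs, case_sensitive=True,
                               only_ascii=True, no_numbers=True):
    """Build a toy term-document matrix.

    docs maps a document label (e.g. a file name) to its text.
    The three flags are an assumed reading of the casesensitive,
    onlyascii and nonumbers options; they are illustrative only.
    """
    labels = list(docs)
    counts = []
    for label in labels:
        text = docs[label]
        if not case_sensitive:
            text = text.lower()  # fold case so "Cats" and "cats" merge
        # Assumed tokenisation: runs of word characters.
        tokens = re.findall(r"\w+", text)
        if only_ascii:
            tokens = [t for t in tokens if t.isascii()]
        if no_numbers:
            tokens = [t for t in tokens if not t.isdigit()]
        counts.append(Counter(tokens))

    # One row per term, one column per document (same order as labels).
    terms = sorted(set().union(*counts))
    matrix = {term: [c[term] for c in counts] for term in terms}

    # Second table: which document each column refers to.
    information = {f"doc_{i + 1}": label for i, label in enumerate(labels)}
    return matrix, information


if __name__ == "__main__":
    tdm, info = build_term_document_matrix(
        {"a": "Cats chase mice 2 times", "b": "mice fear cats"}
    )
    for term, row in tdm.items():
        print(term, row)
    print(info)
```

In this sketch the term "mice" gets the row [1, 1] because it occurs once in each document, while "Cats" and "cats" stay separate rows because case is preserved. The information table plays the role described above: it tells which file each column of the matrix belongs to.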