uk.ac.ebi.adfconverter.tools.checker
Class FileChecker

java.lang.Object
  extended by uk.ac.ebi.adfconverter.tools.checker.FileChecker
Direct Known Subclasses:
ADFChecker

public class FileChecker
extends java.lang.Object

Class for file structure checking: checks the structure of a file and the data it contains.

Version:
1 08 July 2004 TODO: test with a hashtable to avoid reading the same file several times
Author:
Pierre MARGUERITE

Constructor Summary
FileChecker()
          Default Constructor of FileChecker
FileChecker(java.lang.String structureFile)
          Constructor of FileChecker with a given structure file
 
Method Summary
 boolean checkCurrentRowFieldDependance(java.util.Hashtable itemTable)
          Checks dependencies between fields of a row.
 boolean checkDataTable(DataTable dataTable, HeaderType header, boolean strictMode, boolean doCuration, boolean stepByStepMode)
          Checks a data table against a structure file.
 boolean checkFile(java.io.File checkFile, FileType fileStructure, boolean strictMode, boolean doCuration, boolean stepByStepMode)
          Checks a data file.
 CorrectableDataTable checkFileAndCurate(java.io.File checkFile, FileType fileStructure, boolean strictMode, boolean stepByStepMode)
          Checks and curates a data file.
 CorrectableDataTable checkTableAndCurate(DataTable table, HeaderType headerStructure, boolean strictMode, boolean stepByStepMode)
          Checks a data table against a structure file and returns the corrected/curated (if needed) data table.
static java.lang.String convert2Regex(java.lang.String regex)
          Converts a String into a Java regex: special characters are not recognised in that String.
 CorrectableDataTable getCuratedTable()
          Retrieves the curated data table for the current data file
 FileType getFileStructure(java.lang.String structureName, java.lang.String _structureFile)
          Retrieves the FileType object from the XML file describing the file structure
 HeaderType getHeaderStructure(java.lang.String structureName, java.lang.String _structureFile)
          Retrieves the HeaderType object corresponding to a given header structure
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

FileChecker

public FileChecker()
Default Constructor of FileChecker


FileChecker

public FileChecker(java.lang.String structureFile)
Constructor of FileChecker with a given structure file

Parameters:
structureFile - the structure file to use for checking
Method Detail

checkCurrentRowFieldDependance

public boolean checkCurrentRowFieldDependance(java.util.Hashtable itemTable)
Checks dependencies between the fields of a row: a value can be mandatory in one field when a value is present in another field.

Parameters:
itemTable - list of non-empty fields in a row
Returns:
true, if the dependencies are satisfied. false, otherwise.
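
The dependency rule can be illustrated with a minimal, self-contained sketch. It is not the FileChecker implementation: the class name FieldDependencySketch, the dependencies map (a field name mapped to the field it requires) and the field names "Unit" and "Value" are assumptions made for illustration. Only the idea of passing the non-empty fields of a row in a Hashtable comes from this documentation.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

public class FieldDependencySketch {

    /**
     * Returns true when every dependency rule is satisfied by the row.
     *
     * @param itemTable    non-empty fields of the current row (field name to value)
     * @param dependencies hypothetical rules: if the key field is present,
     *                     the value field must be present too
     */
    public static boolean checkRow(Hashtable<String, String> itemTable,
                                   Map<String, String> dependencies) {
        for (Map.Entry<String, String> rule : dependencies.entrySet()) {
            boolean triggerPresent = itemTable.containsKey(rule.getKey());
            boolean requiredPresent = itemTable.containsKey(rule.getValue());
            if (triggerPresent && !requiredPresent) {
                return false; // a mandatory value is missing
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical rule: a row that gives a "Unit" must also give a "Value"
        Map<String, String> rules = new HashMap<>();
        rules.put("Unit", "Value");

        Hashtable<String, String> row = new Hashtable<>();
        row.put("Unit", "mg");
        System.out.println(checkRow(row, rules)); // false: "Value" is missing

        row.put("Value", "5");
        System.out.println(checkRow(row, rules)); // true
    }
}
```

A row that contains neither field also passes, because the rule only applies when the triggering field is present.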

checkDataTable

public boolean checkDataTable(DataTable dataTable,
                              HeaderType header,
                              boolean strictMode,
                              boolean doCuration,
                              boolean stepByStepMode)
                       throws ErrorInitLog,
                              IncorrectFile,
                              IncorrectFileStructure,
                              java.io.IOException,
                              IncorrectDataTableException
Checks a data table against a structure file.

Parameters:
dataTable - the data table to check
header - the structure definition of the table header (header information)
strictMode - true, if the checking is done in strict mode
doCuration - true, if the data are curated and exported to a file
stepByStepMode - true, if the checking is done step by step. false, otherwise.
Returns:
true, if the data table is correct. false, otherwise.
Throws:
ErrorInitLog - if an error occurs during the logger initialisation
IncorrectFile - if a file is incorrect.
IncorrectFileStructure - if the structure definition file is incorrect
java.io.IOException - if an error occurs during file access.
IncorrectDataTableException - if the data table is incorrect

checkFile

public boolean checkFile(java.io.File checkFile,
                         FileType fileStructure,
                         boolean strictMode,
                         boolean doCuration,
                         boolean stepByStepMode)
                  throws ErrorInitLog,
                         IncorrectFile,
                         IncorrectFileStructure,
                         java.io.IOException,
                         IncorrectDataTableException,
                         IncorrectDataFile
Checks a data file.

Parameters:
checkFile - the file containing data to check
fileStructure - the description of the expected file structure
strictMode - true, if the checking is done strictly
doCuration - true, if the data are curated and exported to a file with _curated_ before the file extension
stepByStepMode - true, if the checking is done step by step (stopping after each error discovered)
Returns:
true, if the file matches the given file structure definition
Throws:
ErrorInitLog - if an error occurs during logger initialisation
IncorrectFile - if the data file does not conform to the structure file
IncorrectFileStructure - if the structure is invalid
java.io.IOException - if an error occurs while accessing the data file
IncorrectDataTableException - if the data do not conform to the structure file
IncorrectDataFile - if the data file is incorrect
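
As an illustration of the naming rule given for doCuration, the following sketch derives the name of a curated file by placing _curated_ before the file extension. The class and method names are hypothetical, and the exact name produced by FileChecker (in particular for files without an extension) is an assumption that this documentation does not confirm.

```java
import java.io.File;

public class CuratedNameSketch {

    /** Inserts "_curated_" before the file extension, e.g. array.adf becomes array_curated_.adf. */
    public static File curatedFileFor(File dataFile) {
        String name = dataFile.getName();
        int dot = name.lastIndexOf('.');
        String curatedName = (dot > 0)
                ? name.substring(0, dot) + "_curated_" + name.substring(dot)
                : name + "_curated_"; // assumption: without an extension, append the marker
        // Keep the curated file in the same directory as the data file
        return new File(dataFile.getParentFile(), curatedName);
    }

    public static void main(String[] args) {
        System.out.println(curatedFileFor(new File("array.adf")).getName()); // array_curated_.adf
    }
}
```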

checkFileAndCurate

public CorrectableDataTable checkFileAndCurate(java.io.File checkFile,
                                               FileType fileStructure,
                                               boolean strictMode,
                                               boolean stepByStepMode)
                                        throws ErrorInitLog,
                                               IncorrectFile,
                                               IncorrectFileStructure,
                                               java.io.IOException,
                                               IncorrectDataTableException,
                                               IncorrectDataFile
Checks and curates a data file.

Parameters:
checkFile - the data file to check
fileStructure - the description of the structure of the data file
strictMode - true, if the checking is strict
stepByStepMode - true, if the checking is done step by step
Returns:
the curated data table corresponding to the data file
Throws:
ErrorInitLog - if an error occurs during logger initialisation
IncorrectFile - if the data file is incorrect
IncorrectFileStructure - if the structure file is incorrect
java.io.IOException - if an error occurs during file access
IncorrectDataTableException - if the data table is incorrect
IncorrectDataFile - if the data file is incorrect

checkTableAndCurate

public CorrectableDataTable checkTableAndCurate(DataTable table,
                                                HeaderType headerStructure,
                                                boolean strictMode,
                                                boolean stepByStepMode)
                                         throws ErrorInitLog,
                                                IncorrectFile,
                                                IncorrectFileStructure,
                                                java.io.IOException,
                                                IncorrectDataTableException,
                                                IncorrectDataFile
Checks a data table against a structure file and returns the corrected/curated (if needed) data table.

Parameters:
table - the data table to check
headerStructure - the structure definition of the table header (header information)
strictMode - true, if the checking is done in strict mode
stepByStepMode - true, if the checking is done step by step. false, otherwise.
Returns:
the checked/curated table, if the table is correct. null, otherwise.
Throws:
ErrorInitLog - if an error occurs during the logger initialisation
IncorrectFile - if a file is incorrect
IncorrectFileStructure - if the structure definition file is incorrect
java.io.IOException - if an error occurs during file access.
IncorrectDataTableException - if the data table is incorrect
IncorrectDataFile - if the file containing the data is incorrect

convert2Regex

public static java.lang.String convert2Regex(java.lang.String regex)
Converts a String into a Java regex: special characters are not recognised in that String.
TODO: the conversion does not deal with every character.

Java regular-expression constructs:

Characters
x - The character x
\\ - The backslash character
\0n - The character with octal value 0n (0 <= n <= 7)
\0nn - The character with octal value 0nn (0 <= n <= 7)
\0mnn - The character with octal value 0mnn (0 <= m <= 3, 0 <= n <= 7)
\xhh - The character with hexadecimal value 0xhh
\uhhhh - The character with hexadecimal value 0xhhhh
\t - The tab character ('\u0009')
\n - The newline (line feed) character ('\u000A')
\r - The carriage-return character ('\u000D')
\f - The form-feed character ('\u000C')
\a - The alert (bell) character ('\u0007')
\e - The escape character ('\u001B')
\cx - The control character corresponding to x

Character classes
[abc] - a, b, or c (simple class)
[^abc] - Any character except a, b, or c (negation)
[a-zA-Z] - a through z or A through Z, inclusive (range)
[a-d[m-p]] - a through d, or m through p: [a-dm-p] (union)
[a-z&&[def]] - d, e, or f (intersection)
[a-z&&[^bc]] - a through z, except for b and c: [ad-z] (subtraction)
[a-z&&[^m-p]] - a through z, and not m through p: [a-lq-z] (subtraction)

Predefined character classes
. - Any character (may or may not match line terminators)
\d - A digit: [0-9]
\D - A non-digit: [^0-9]
\s - A whitespace character: [ \t\n\x0B\f\r]
\S - A non-whitespace character: [^\s]
\w - A word character: [a-zA-Z_0-9]
\W - A non-word character: [^\w]

POSIX character classes (US-ASCII only)
\p{Lower} - A lower-case alphabetic character: [a-z]
\p{Upper} - An upper-case alphabetic character: [A-Z]
\p{ASCII} - All ASCII: [\x00-\x7F]
\p{Alpha} - An alphabetic character: [\p{Lower}\p{Upper}]
\p{Digit} - A decimal digit: [0-9]
\p{Alnum} - An alphanumeric character: [\p{Alpha}\p{Digit}]
\p{Punct} - Punctuation: One of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
\p{Graph} - A visible character: [\p{Alnum}\p{Punct}]
\p{Print} - A printable character: [\p{Graph}]
\p{Blank} - A space or a tab: [ \t]
\p{Cntrl} - A control character: [\x00-\x1F\x7F]
\p{XDigit} - A hexadecimal digit: [0-9a-fA-F]
\p{Space} - A whitespace character: [ \t\n\x0B\f\r]

Classes for Unicode blocks and categories
\p{InGreek} - A character in the Greek block (simple block)
\p{Lu} - An uppercase letter (simple category)
\p{Sc} - A currency symbol
\P{InGreek} - Any character except one in the Greek block (negation)
[\p{L}&&[^\p{Lu}]] - Any letter except an uppercase letter (subtraction)

Boundary matchers
^ - The beginning of a line
$ - The end of a line
\b - A word boundary
\B - A non-word boundary
\A - The beginning of the input
\G - The end of the previous match
\Z - The end of the input but for the final terminator, if any
\z - The end of the input

Greedy quantifiers
X? - X, once or not at all
X* - X, zero or more times
X+ - X, one or more times
X{n} - X, exactly n times
X{n,} - X, at least n times
X{n,m} - X, at least n but not more than m times

Reluctant quantifiers
X?? - X, once or not at all
X*? - X, zero or more times
X+? - X, one or more times
X{n}? - X, exactly n times
X{n,}? - X, at least n times
X{n,m}? - X, at least n but not more than m times

Possessive quantifiers
X?+ - X, once or not at all
X*+ - X, zero or more times
X++ - X, one or more times
X{n}+ - X, exactly n times
X{n,}+ - X, at least n times
X{n,m}+ - X, at least n but not more than m times

Logical operators
XY - X followed by Y
X|Y - Either X or Y
(X) - X, as a capturing group

Back references
\n - Whatever the nth capturing group matched

Quotation
\ - Nothing, but quotes the following character
\Q - Nothing, but quotes all characters until \E
\E - Nothing, but ends quoting started by \Q

Special constructs (non-capturing)
(?:X) - X, as a non-capturing group
(?idmsux-idmsux) - Nothing, but turns match flags on - off
(?idmsux-idmsux:X) - X, as a non-capturing group with the given flags on - off
(?=X) - X, via zero-width positive lookahead
(?!X) - X, via zero-width negative lookahead
(?<=X) - X, via zero-width positive lookbehind
(?<!X) - X, via zero-width negative lookbehind
(?>X) - X, as an independent, non-capturing group

Parameters:
regex - the regex to convert
Returns:
the String corresponding to the provided regex String, modified to be recognized as a Java regex
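
The source of convert2Regex is not shown in this documentation, so the following is only a sketch of one way to turn an arbitrary String into a Java regex that matches it literally. It uses the standard java.util.regex.Pattern.quote method (available since Java 5) rather than the method documented here. The class name LiteralRegexSketch, the method toLiteralRegex and the sample string are assumptions.

```java
import java.util.regex.Pattern;

public class LiteralRegexSketch {

    /** Wraps the text in \Q...\E so that regex metacharacters are matched literally. */
    public static String toLiteralRegex(String text) {
        return Pattern.quote(text);
    }

    public static void main(String[] args) {
        String header = "Reporter[actual]";
        // Unquoted, [actual] is a character class, so the header does not match itself
        System.out.println(Pattern.matches(header, header));                 // false
        // Quoted, the brackets are ordinary characters
        System.out.println(Pattern.matches(toLiteralRegex(header), header)); // true
    }
}
```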

getCuratedTable

public CorrectableDataTable getCuratedTable()
Retrieves the curated data table for the current data file

Returns:
the curated table

getFileStructure

public FileType getFileStructure(java.lang.String structureName,
                                 java.lang.String _structureFile)
                          throws java.lang.Exception
Retrieves the FileType object from the XML file describing the file structure

Parameters:
structureName - the name of the FileType object in the XML tree
_structureFile - the XML file containing the file structure description
Returns:
the FileType object corresponding to the structure name, if it is contained in the structure file. If it is not found, null is returned.
Throws:
UnknownStructure - if the structure file is incorrect
java.lang.Exception - if an error occurs during retrieval

getHeaderStructure

public HeaderType getHeaderStructure(java.lang.String structureName,
                                     java.lang.String _structureFile)
                              throws UnknownStructure,
                                     java.lang.Exception
Retrieves the HeaderType object corresponding to a given header structure

Parameters:
structureName - the name of the header
_structureFile - the file containing the XML description
Returns:
the HeaderType object corresponding to the structureName in the XML file
Throws:
UnknownStructure - if the XML file is incorrect
java.lang.Exception - if an error occurs during retrieval


European Bioinformatics Institute, Microarray Informatics Team