uk.ac.ebi.adfconverter.tools.checker
Class OptimizedFileChecker

java.lang.Object
  extended by uk.ac.ebi.adfconverter.tools.checker.OptimizedFileChecker
Direct Known Subclasses:
OptimizedADFChecker

public class OptimizedFileChecker
extends java.lang.Object

Checks data tables in an optimized way (memory, CPU). Data tables are checked row by row.
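
The row-by-row approach can be illustrated with a minimal sketch. It is not this class's implementation: the class name RowByRowSketch, the method findBadRows, and the column-count rule are hypothetical and serve only to show how a table can be checked while keeping a single row in memory.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration; not part of uk.ac.ebi.adfconverter.
public class RowByRowSketch {

    /** Returns the 1-based numbers of rows whose column count differs from expectedColumns. */
    static List<Integer> findBadRows(Iterator<String> rows, int expectedColumns) {
        List<Integer> badRows = new ArrayList<>();
        int rowNumber = 0;
        // Only the current row is examined; the table is never loaded as a whole.
        while (rows.hasNext()) {
            rowNumber++;
            // The -1 limit keeps trailing empty fields so that they are counted.
            String[] fields = rows.next().split("\t", -1);
            if (fields.length != expectedColumns) {
                badRows.add(rowNumber);
            }
        }
        return badRows;
    }

    public static void main(String[] args) throws IOException {
        // Assumed tab-delimited input; the third row lacks one field.
        String table = "id\tname\tvalue\n1\tfoo\t3.2\n2\tbar\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(table))) {
            // lines() reads lazily, one row at a time.
            System.out.println(findBadRows(reader.lines().iterator(), 3));
        }
    }
}
```

Because the rows arrive through an iterator, memory use stays roughly constant however large the data file is.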

Since:
6 Feb. 2005
Version:
1
Author:
Pierre MARGUERITE

Constructor Summary
OptimizedFileChecker()
          Constructor
 
Method Summary
 boolean checkCurrentRowFieldDependance(java.util.Hashtable itemTable)
          Checks dependencies between fields of a row.
 java.util.ArrayList checkDocument(java.io.File file, java.lang.String structureFile, boolean strictMode, boolean stepByStepMode, boolean doCuration)
          Checks a file against a document structure (contained in an XML file).
 boolean checkFile(java.io.File file, java.lang.String structureFile, boolean strictMode, boolean stepByStepMode, boolean doCuration)
           
 boolean checkFile(java.io.File file, java.lang.String structureFile, java.lang.String structureName, boolean strictMode, boolean stepByStepMode, boolean doCuration)
           
 CorrectableDataTable checkFileAndCurate(java.io.File checkFile, java.lang.String fileStructure, boolean strictMode, boolean stepByStepMode)
          Checks and curates a data file.
 CorrectableDataTable checkFileAndCurate(java.io.File checkFile, java.lang.String fileStructure, java.lang.String structureName, boolean strictMode, boolean stepByStepMode)
          Checks and curates a data file.
 boolean checkFileData(java.io.File file, java.lang.String structureFile, java.lang.String structureName, boolean strictMode, boolean stepByStepMode, boolean doCuration)
           
 boolean checkVerticalDataTable(DataTable dataTable, HeaderType header, boolean strictMode, boolean doCuration, boolean stepByStepMode)
          Checks a vertical data table against a structure file. The whole table is checked (not row by row).
static java.lang.String convert2Regex(java.lang.String regex)
          Converts a String into a Java regex.
 CorrectableDataTable getCuratedTable()
          Retrieves the curated data table for the current data file
static void main(java.lang.String[] argv)
           
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

OptimizedFileChecker

public OptimizedFileChecker()
Constructor

Method Detail

checkCurrentRowFieldDependance

public boolean checkCurrentRowFieldDependance(java.util.Hashtable itemTable)
Checks dependencies between fields of a row. A value can be mandatory in one field if a value is present in another field.

Parameters:
itemTable - table of the non-empty fields in a row
Returns:
true, if the dependencies are verified/correct. false, otherwise
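
A field dependency of this kind can be sketched as follows. This is an assumption about the general idea, not the actual implementation: the class FieldDependencySketch, the rule map, and the field names "Unit" and "Value" are hypothetical.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

// Hypothetical illustration; not part of uk.ac.ebi.adfconverter.
public class FieldDependencySketch {

    /**
     * Each rule maps a trigger field to a required field. Returns true when
     * every row that fills a trigger field also fills its required field.
     */
    static boolean dependenciesSatisfied(Map<String, String> rules,
                                         Hashtable<String, String> itemTable) {
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            boolean triggerPresent = itemTable.containsKey(rule.getKey());
            boolean requiredPresent = itemTable.containsKey(rule.getValue());
            if (triggerPresent && !requiredPresent) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Assumed rule: a row that has a "Unit" must also have a "Value".
        Map<String, String> rules = new HashMap<>();
        rules.put("Unit", "Value");

        // itemTable holds only the non-empty fields of the current row.
        Hashtable<String, String> row = new Hashtable<>();
        row.put("Unit", "mg");
        System.out.println(dependenciesSatisfied(rules, row)); // "Value" is missing

        row.put("Value", "5");
        System.out.println(dependenciesSatisfied(rules, row)); // both fields present
    }
}
```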

checkDocument

public java.util.ArrayList checkDocument(java.io.File file,
                                         java.lang.String structureFile,
                                         boolean strictMode,
                                         boolean stepByStepMode,
                                         boolean doCuration)
Checks a file against a document structure (contained in an XML file).

Parameters:
file - data file to check
structureFile - the structure file containing the structure definition
strictMode - true, if the strict checking mode is enabled. false, otherwise
stepByStepMode - true, if the step-by-step checking mode is enabled. false, otherwise
Returns:

checkFile

public boolean checkFile(java.io.File file,
                         java.lang.String structureFile,
                         boolean strictMode,
                         boolean stepByStepMode,
                         boolean doCuration)

checkFile

public boolean checkFile(java.io.File file,
                         java.lang.String structureFile,
                         java.lang.String structureName,
                         boolean strictMode,
                         boolean stepByStepMode,
                         boolean doCuration)

checkFileAndCurate

public CorrectableDataTable checkFileAndCurate(java.io.File checkFile,
                                               java.lang.String fileStructure,
                                               boolean strictMode,
                                               boolean stepByStepMode)
                                        throws ErrorInitLog,
                                               IncorrectFile,
                                               IncorrectFileStructure,
                                               java.io.IOException,
                                               IncorrectDataTableException,
                                               IncorrectDataFile
Checks and curates a data file.

Parameters:
checkFile - path to the data file
fileStructure - file describing the structure of the data file
strictMode - if the checking is strict
stepByStepMode - if checking is done step by step
Returns:
the curated data table corresponding to the data file
Throws:
ErrorInitLog - if an error occurs during logger initialisation
IncorrectFile - if the data file is incorrect
IncorrectFileStructure - if the structure file is incorrect
java.io.IOException - if an error occurs during file access
IncorrectDataTableException - if the data table is incorrect
IncorrectDataFile - if the data file is incorrect

checkFileAndCurate

public CorrectableDataTable checkFileAndCurate(java.io.File checkFile,
                                               java.lang.String fileStructure,
                                               java.lang.String structureName,
                                               boolean strictMode,
                                               boolean stepByStepMode)
                                        throws ErrorInitLog,
                                               IncorrectFile,
                                               IncorrectFileStructure,
                                               java.io.IOException,
                                               IncorrectDataTableException,
                                               IncorrectDataFile
Checks and curates a data file.

Parameters:
checkFile - path to the data file
fileStructure - file describing the structure of the data file
strictMode - if the checking is strict
stepByStepMode - if checking is done step by step
Returns:
the curated data table corresponding to the data file
Throws:
ErrorInitLog - if an error occurs during logger initialisation
IncorrectFile - if the data file is incorrect
IncorrectFileStructure - if the structure file is incorrect
java.io.IOException - if an error occurs during file access
IncorrectDataTableException - if the data table is incorrect
IncorrectDataFile - if the data file is incorrect

checkFileData

public boolean checkFileData(java.io.File file,
                             java.lang.String structureFile,
                             java.lang.String structureName,
                             boolean strictMode,
                             boolean stepByStepMode,
                             boolean doCuration)

checkVerticalDataTable

public boolean checkVerticalDataTable(DataTable dataTable,
                                      HeaderType header,
                                      boolean strictMode,
                                      boolean doCuration,
                                      boolean stepByStepMode)
                               throws IncorrectFileStructure,
                                      IncorrectDataTableException
Checks a vertical data table against a structure file. The whole table is checked (not row by row).

Parameters:
dataTable - the data table to check
header - the structure definition of the table header (header information)
doCuration - true, if the data are curated and exported to a file
strictMode - true, if checking is done in strict mode
stepByStepMode - true, if step-by-step mode is enabled. false, otherwise.
Returns:
true, if the data table is correct. false, otherwise.
Throws:
ErrorInitLog - if an error occurs during the logger initialisation
IncorrectFile - if a file is incorrect.
IncorrectFileStructure - if the structure definition file is incorrect
java.io.IOException - if an error occurs during file access.
IncorrectDataTableException - if the data table is incorrect

convert2Regex

public static java.lang.String convert2Regex(java.lang.String regex)
Converts a String into a Java regex: special characters are not recognised in that String.

TODO: the conversion does not deal with every character.

Summary of Java regular-expression constructs (as listed in the java.util.regex.Pattern documentation):

Characters
  x             The character x
  \\            The backslash character
  \0n           The character with octal value 0n (0 <= n <= 7)
  \0nn          The character with octal value 0nn (0 <= n <= 7)
  \0mnn         The character with octal value 0mnn (0 <= m <= 3, 0 <= n <= 7)
  \xhh          The character with hexadecimal value 0xhh
  \uhhhh        The character with hexadecimal value 0xhhhh
  \t            The tab character ('\u0009')
  \n            The newline (line feed) character ('\u000A')
  \r            The carriage-return character ('\u000D')
  \f            The form-feed character ('\u000C')
  \a            The alert (bell) character ('\u0007')
  \e            The escape character ('\u001B')
  \cx           The control character corresponding to x

Character classes
  [abc]         a, b, or c (simple class)
  [^abc]        Any character except a, b, or c (negation)
  [a-zA-Z]      a through z or A through Z, inclusive (range)
  [a-d[m-p]]    a through d, or m through p: [a-dm-p] (union)
  [a-z&&[def]]  d, e, or f (intersection)
  [a-z&&[^bc]]  a through z, except for b and c: [ad-z] (subtraction)
  [a-z&&[^m-p]] a through z, and not m through p: [a-lq-z] (subtraction)

Predefined character classes
  .             Any character (may or may not match line terminators)
  \d            A digit: [0-9]
  \D            A non-digit: [^0-9]
  \s            A whitespace character: [ \t\n\x0B\f\r]
  \S            A non-whitespace character: [^\s]
  \w            A word character: [a-zA-Z_0-9]
  \W            A non-word character: [^\w]

POSIX character classes (US-ASCII only)
  \p{Lower}     A lower-case alphabetic character: [a-z]
  \p{Upper}     An upper-case alphabetic character: [A-Z]
  \p{ASCII}     All ASCII: [\x00-\x7F]
  \p{Alpha}     An alphabetic character: [\p{Lower}\p{Upper}]
  \p{Digit}     A decimal digit: [0-9]
  \p{Alnum}     An alphanumeric character: [\p{Alpha}\p{Digit}]
  \p{Punct}     Punctuation: One of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
  \p{Graph}     A visible character: [\p{Alnum}\p{Punct}]
  \p{Print}     A printable character: [\p{Graph}]
  \p{Blank}     A space or a tab: [ \t]
  \p{Cntrl}     A control character: [\x00-\x1F\x7F]
  \p{XDigit}    A hexadecimal digit: [0-9a-fA-F]
  \p{Space}     A whitespace character: [ \t\n\x0B\f\r]

Classes for Unicode blocks and categories
  \p{InGreek}   A character in the Greek block (simple block)
  \p{Lu}        An uppercase letter (simple category)
  \p{Sc}        A currency symbol
  \P{InGreek}   Any character except one in the Greek block (negation)
  [\p{L}&&[^\p{Lu}]]  Any letter except an uppercase letter (subtraction)

Boundary matchers
  ^             The beginning of a line
  $             The end of a line
  \b            A word boundary
  \B            A non-word boundary
  \A            The beginning of the input
  \G            The end of the previous match
  \Z            The end of the input but for the final terminator, if any
  \z            The end of the input

Greedy quantifiers
  X?            X, once or not at all
  X*            X, zero or more times
  X+            X, one or more times
  X{n}          X, exactly n times
  X{n,}         X, at least n times
  X{n,m}        X, at least n but not more than m times

Reluctant quantifiers
  X??           X, once or not at all
  X*?           X, zero or more times
  X+?           X, one or more times
  X{n}?         X, exactly n times
  X{n,}?        X, at least n times
  X{n,m}?       X, at least n but not more than m times

Possessive quantifiers
  X?+           X, once or not at all
  X*+           X, zero or more times
  X++           X, one or more times
  X{n}+         X, exactly n times
  X{n,}+        X, at least n times
  X{n,m}+       X, at least n but not more than m times

Logical operators
  XY            X followed by Y
  X|Y           Either X or Y
  (X)           X, as a capturing group

Back references
  \n            Whatever the nth capturing group matched

Quotation
  \             Nothing, but quotes the following character
  \Q            Nothing, but quotes all characters until \E
  \E            Nothing, but ends quoting started by \Q

Special constructs (non-capturing)
  (?:X)         X, as a non-capturing group
  (?idmsux-idmsux)    Nothing, but turns match flags on - off
  (?idmsux-idmsux:X)  X, as a non-capturing group with the given flags on - off
  (?=X)         X, via zero-width positive lookahead
  (?!X)         X, via zero-width negative lookahead
  (?<=X)        X, via zero-width positive lookbehind
  (?<!X)        X, via zero-width negative lookbehind
  (?>X)         X, as an independent, non-capturing group

Parameters:
regex - the regex to convert
Returns:
the String corresponding to the provided regex String, modified to be recognized as a Java regex
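
For comparison, the standard library offers java.util.regex.Pattern.quote (available since Java 5) to make a String match literally. The sketch below uses only that standard call; whether convert2Regex behaves the same way is not stated here, and the class name LiteralRegexSketch and its helper method are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical illustration based on the standard library only;
// it does not call or reproduce convert2Regex.
public class LiteralRegexSketch {

    /** Returns true when input is exactly the given literal, with no regex interpretation. */
    static boolean matchesLiterally(String literal, String input) {
        // Pattern.quote wraps the text so that ".", "+", "(" and similar
        // characters lose their special meaning.
        return Pattern.compile(Pattern.quote(literal)).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Unquoted, "." matches any character, so "axb" is accepted.
        System.out.println(Pattern.matches("a.b", "axb"));
        // Quoted, only the literal text "a.b" is accepted.
        System.out.println(matchesLiterally("a.b", "axb"));
        System.out.println(matchesLiterally("a.b", "a.b"));
    }
}
```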

getCuratedTable

public CorrectableDataTable getCuratedTable()
Retrieves the curated data table for the current data file

Returns:
the curated table

main

public static void main(java.lang.String[] argv)


European Bioinformatics Institute, Microarray Informatics Team