[Research] Changes of h/utils and h/dset

Joey Quansheng Liang qsliang at cmu.edu
Tue Feb 21 10:12:31 EST 2006


1. Added a new function in h/utils/am_string.h(c):

/** Convert a string to a double value, if possible
     input:
         const char *: The target null-terminated string
     output:
         double *:  The target address that receives the double value.
                    The value is left unchanged if the function
                        returns 'false'
     return:
         bool: 'true' on success, otherwise 'false'

     Example conversions:
           "213.21" => 213.21
           "$212,212" => 212212
           "($23,323,212)" => -23323212 (printed as -2.33232e+007)
           "(843" is not a number, return 'false'
           "  -$54,432" => -54432
           "34,234%" => 342.34
           "2.34e-1" => 0.234
           "2.34E1.2" => 37.0865
           "50/3" => 16.6667

     Some combinations are not allowed, like:
          "(34.21E2)" : Neither scientific nor accounting notation
          "3.1415926/25" : Not a fraction but a formula
     Some combinations are allowed although they may not be meaningful,
        like:
          "$3278%" => 32.78
*/
bool string_to_double(const char *s, double *pValue);
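
To make the conversion rules above concrete, here is a minimal sketch of a
converter for a subset of them: an optional sign or accounting parentheses,
an optional '$', comma separators, and a trailing '%'. It is not the code in
h/utils/am_string.c. The name sketch_to_double is hypothetical, and the
scientific ("2.34e-1") and fraction ("50/3") forms are deliberately left out,
so this sketch rejects them.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper for illustration only, NOT the h/utils function.
   On failure it returns false and leaves *out untouched. */
bool sketch_to_double(const char *s, double *out)
{
    char digits[64];   /* digits and decimal point, commas stripped */
    size_t n = 0;
    bool negative = false, paren = false, percent = false;
    bool seen_digit = false, seen_point = false;
    double value;

    assert(s != NULL && out != NULL);

    while (isspace((unsigned char)*s)) s++;

    /* "(...)" is accounting notation for a negative value */
    if (*s == '(') { paren = true; negative = true; s++; }
    else if (*s == '-') { negative = true; s++; }
    else if (*s == '+') { s++; }
    if (*s == '$') s++;

    for (; isdigit((unsigned char)*s) || *s == ',' || *s == '.'; s++) {
        if (*s == ',') {
            /* separators only make sense inside the integer part */
            if (!seen_digit || seen_point) return false;
            continue;
        }
        if (*s == '.') {
            if (seen_point) return false;
            seen_point = true;
        } else {
            seen_digit = true;
        }
        if (n + 1 >= sizeof digits) return false;
        digits[n++] = *s;
    }
    digits[n] = '\0';
    if (!seen_digit) return false;

    if (*s == '%') { percent = true; s++; }
    if (paren) {
        /* an opening parenthesis must be closed, so "(843" fails */
        if (*s != ')') return false;
        s++;
    }
    while (isspace((unsigned char)*s)) s++;
    if (*s != '\0') return false;

    value = strtod(digits, NULL);
    if (percent) value /= 100.0;
    *out = negative ? -value : value;
    return true;
}
```

A caller passes the string and the address of a double, then checks the
return value before using the result, for example
sketch_to_double("$212,212", &v). Writing to the output only on success
mirrors the "value will not be changed" contract documented above.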


2. The datset realization now uses the function above to convert symbols
to doubles, so many more symbol formats can now be realized.
    Out of curiosity, I ran a performance test (Linux, debug build):
    1. Generated a datset of 10,000 rows by 20 columns filled with random
numbers;
    2. Forced the whole datset to load as symbolic;
    3. Made a copy of the loaded datset;
    4. Realized all 20 columns, using the old realization code on one
copy and the new code on the other.
    The old code took 7 seconds and the new code took 6 seconds. That's great!
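
A comparison like the one above can be reproduced in miniature by timing a
converter over an array of symbol strings. The sketch below is only an
illustration: time_conversions, convert_fn and strtod_convert are
hypothetical names, the stand-in converter is plain strtod rather than the
old or new realization code, and clock() measures CPU time, not wall-clock
time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Any converter with the same shape as string_to_double can be timed. */
typedef bool (*convert_fn)(const char *s, double *out);

/* Stand-in converter (an assumption): accepts only strings that strtod
   consumes completely. */
static bool strtod_convert(const char *s, double *out)
{
    char *end;
    double v = strtod(s, &end);

    if (end == s || *end != '\0') return false;
    *out = v;
    return true;
}

/* Convert every symbol once; report the CPU seconds used and how many
   symbols converted successfully. */
double time_conversions(const char *const *symbols, size_t count,
                        convert_fn convert, size_t *converted)
{
    clock_t start;
    size_t i, ok = 0;
    double value;

    assert(symbols != NULL && convert != NULL && converted != NULL);

    start = clock();
    for (i = 0; i < count; i++) {
        if (convert(symbols[i], &value)) ok++;
    }
    *converted = ok;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Running the same symbol array through two converters and comparing the
returned times gives a rough old-versus-new figure. Counting the successful
conversions as well guards against a converter that looks fast only because
it rejects most inputs.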


Let me know if you have any questions.
-- 
Joey (QuanSheng) Liang
Auton Lab, RI, SCS, Carnegie Mellon University
W: http://www.AutonLab.org/
P: http://www.ChemiLab.net/


