Development

  • data management

    3 answers - 1683 bytes

    First, I would like to thank the mailing list for the help I have received in
    the past. As a newcomer to R, I need some support on how to code the
    following problem.
    I am trying to sort some data I have in a big file. The file has 4 columns
    and 19000 rows. An example looks like this:
    G 0.892 A 0.108
    G 0.883 T 0.117
    T 0.5 C 0.5
    A 0.617 G 0.383
    G 0.925 A 0.075
    A 0.967 G 0.033
    C 0.883 T 0.117
    C 0.633 T 0.367
    G 0.95 A 0.05
    C 0.742 G 0.258
    G 0.875 T 0.125
    T 0.167 C 0.833
    C 0.792 A 0.208
    Columns one and three are letters, while columns two and four are their
    corresponding values.
    I want to sort this data so that the first and third columns of each row are
    in alphabetical order. For example, in the first row the order is "G" then "A".
    This is not in alphabetical order, so we swap them along with their
    values, and the row becomes:
    A 0.108 G 0.892
    Row two looks fine, but row three needs the same rearrangement as row one.
    The final output looks like:
    A 0.108 G 0.892
    G 0.883 T 0.117
    C 0.5 T 0.5
    A 0.617 G 0.383
    A 0.075 G 0.925
    A 0.967 G 0.033
    C 0.883 T 0.117
    C 0.633 T 0.367
    A 0.05 G 0.95
    C 0.742 G 0.258
    G 0.875 T 0.125
    C 0.833 T 0.167
    A 0.208 C 0.792
    I would appreciate some help with the relevant command names or a technique
    to code this task.
    Thank you in advance.
    Regards, Hannes
    R-help (AT) stat (DOT) math.ethz.ch mailing list
    PLEASE do read the posting guide!
  • No.1 | | 3057 bytes | |

    This 'applies' a function to each row to reverse columns if necessary:

    x
    V1 V2 V3 V4
    1 G 0.892 A 0.108
    2 G 0.883 T 0.117
    3 T 0.500 C 0.500
    4 A 0.617 G 0.383
    5 G 0.925 A 0.075
    6 A 0.967 G 0.033
    7 C 0.883 T 0.117
    8 C 0.633 T 0.367
    9 G 0.950 A 0.050
    10 C 0.742 G 0.258
    11 G 0.875 T 0.125
    12 T 0.167 C 0.833
    13 C 0.792 A 0.208
    str(x)
    'data.frame': 13 obs. of 4 variables:
    $ V1: chr "G" "G" "T" "A" ...
    $ V2: num 0.892 0.883 0.5 0.617 0.925 0.967 0.883 0.633 0.95 0.742 ...
    $ V3: chr "A" "T" "C" "G" ...
    $ V4: num 0.108 0.117 0.5 0.383 0.075 0.033 0.117 0.367 0.05 0.258 ...

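    # apply() passes each row as a character vector, so the result is a
    # character matrix; swap the two (letter, value) pairs when out of order.
    t(apply(x, 1, function(z) {
      if (z[1] < z[3]) return(z)
      else return(z[c(3, 4, 1, 2)])
    }))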
    [,1] [,2] [,3] [,4]
    1 "A" "0.108" "G" "0.892"
    2 "G" "0.883" "T" "0.117"
    3 "C" "0.500" "T" "0.500"
    4 "A" "0.617" "G" "0.383"
    5 "A" "0.075" "G" "0.925"
    6 "A" "0.967" "G" "0.033"
    7 "C" "0.883" "T" "0.117"
    8 "C" "0.633" "T" "0.367"
    9 "A" "0.050" "G" "0.950"
    10 "C" "0.742" "G" "0.258"
    11 "G" "0.875" "T" "0.125"
    12 "C" "0.833" "T" "0.167"
    13 "A" "0.208" "C" "0.792"
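
    One caveat with the apply() approach above is that apply() converts each
    row to character, so the result is a character matrix and the values come
    back as text. A minimal sketch of a vectorized alternative that keeps the
    numeric columns numeric might look like this. It assumes a data frame with
    the column names V1 to V4 from the str() output; the names `swap` and
    `sorted` are illustrative and not part of the original reply.

    ```r
    # A few sample rows from the question, with the V1..V4 names used above.
    x <- data.frame(
      V1 = c("G", "G", "T", "A"),
      V2 = c(0.892, 0.883, 0.5, 0.617),
      V3 = c("A", "T", "C", "G"),
      V4 = c(0.108, 0.117, 0.5, 0.383),
      stringsAsFactors = FALSE
    )

    # TRUE for rows whose first letter sorts after the third-column letter.
    swap <- x$V1 > x$V3

    # Copy the data, then exchange both (letter, value) pairs in those rows only.
    sorted <- x
    sorted$V1[swap] <- x$V3[swap]
    sorted$V2[swap] <- x$V4[swap]
    sorted$V3[swap] <- x$V1[swap]
    sorted$V4[swap] <- x$V2[swap]

    sorted
    ```

    Because the comparison runs on whole columns at once, this avoids calling
    a function once per row, which may matter for a file of 19000 rows.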
    In reply to yohannes alazar <hannesalazar (AT) gmail (DOT) com>, 6/14/06.
  • No.2 | | 2214 bytes | |

    If your df contains your data, try

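    # Glue each letter to its value ("G:0.892") so the pair moves as one unit
    tmp <- cbind( paste(df[ ,1], df[ ,2], sep=":"),
                  paste(df[ ,3], df[ ,4], sep=":") )
    # Sorting each row puts the alphabetically smaller letter first
    tmp <- t( apply(tmp, 1, sort) )

    # Split the pairs back into letter and value columns
    out <- data.frame( do.call(rbind, strsplit( tmp[,1], split=":" )),
                       do.call(rbind, strsplit( tmp[,2], split=":" )) )
    colnames(out) <- colnames(df)
    # strsplit() returns text, so convert the value columns back to numeric
    out[ ,c(2,4)] <- lapply(out[ ,c(2,4)], function(v) as.numeric(as.character(v)))
    out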

    Regards, Adai
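
    For the big file mentioned in the question, the paste()/sort()/strsplit()
    steps in this reply could be wired into a read-and-write workflow roughly
    as follows. This is only a sketch: the temporary files stand in for the
    real input and output files, and `path_in`, `path_out`, `first` and
    `second` are made-up names. Passing colClasses to read.table() is a
    precaution, since a column that happened to contain only "T" would
    otherwise be read as logical TRUE.

    ```r
    # Stand-in for the real data file: three sample rows from the question.
    path_in <- tempfile(fileext = ".txt")
    writeLines(c("G 0.892 A 0.108", "G 0.883 T 0.117", "T 0.5 C 0.5"), path_in)

    # Whitespace-separated, no header; force letter columns to character.
    df <- read.table(path_in, header = FALSE,
                     colClasses = c("character", "numeric", "character", "numeric"))

    # Pair each letter with its value, then sort the two pairs within each row.
    tmp <- cbind(paste(df[, 1], df[, 2], sep = ":"),
                 paste(df[, 3], df[, 4], sep = ":"))
    tmp <- t(apply(tmp, 1, sort))

    # Split the pairs again and rebuild a data frame with numeric values.
    first  <- do.call(rbind, strsplit(tmp[, 1], split = ":"))
    second <- do.call(rbind, strsplit(tmp[, 2], split = ":"))
    out <- data.frame(V1 = first[, 1],  V2 = as.numeric(first[, 2]),
                      V3 = second[, 1], V4 = as.numeric(second[, 2]),
                      stringsAsFactors = FALSE)

    # Sanity check: every row now has its letters in alphabetical order.
    stopifnot(all(out$V1 <= out$V3))

    # Write the result back out in the same plain four-column layout.
    path_out <- tempfile(fileext = ".txt")
    write.table(out, path_out, quote = FALSE, row.names = FALSE, col.names = FALSE)
    ```

    Calling readLines(path_out) afterwards is a quick way to inspect the
    reordered rows before running the same steps on the full file.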

    In reply to yohannes alazar, Wed, 2006-06-14 at 16:35 +0100.
  • No.3 | | 2659 bytes | |

    Thank you very much, this does the job exactly.
    Regards, Hannes

    In reply to Adaikalavan Ramasamy <ramasamy (AT) cancer (DOT) org.uk>, 6/15/06.
