
Showing posts with label web2.0. Show all posts

Saturday, August 3, 2013

Daily Blog #40: Web 2.0 Forensic Part 5

Hello Reader,
                    In the past posts in this series we've focused on what you can recover from web 2.0 sites, how data sits on the disk and how data is transmitted across the network. In this post we talk about what these message fields mean and how to build a quick carver for them. Tomorrow is Saturday Reading and I will be including a link to today's Forensic Lunch cast, which I think was the best so far!

Mail folder summary view versus Mail folder full view:
While viewing the data as it crossed the network, I noticed two distinct types of data streams being sent, at least to Chrome. The first is the page of the mailbox you requested, which contains the message summaries as well as the message contents themselves. The second is the additional pages of the mail folder being viewed, where only the message summaries are sent and cached for faster loading.

The full view is the first page sent and contains data in two sections. The first is the message summary; for example, here is the message summary for my daily win4n6 mailing list digest:

,["cs","140395ee6229f7d4","140395ee6229f7d4",1,,,1375366638336000,"140395ee6229f7d4",["140395ee6229f7d4"]
,[]
,[]
,[["140395ee6229f7d4",["^all","^i","^smartlabel_group","^unsub"]
]
]
,,,[]
,[["","win4n6@yahoogroups.com"]
,["No Reply","notify-dg-win4n6@yahoogroups.com"]
]
,,,[]
,[]
,,,"Digest Number 1388","[win4n6] Digest Number 1388"]
,

Each section of the inbox view with full messages starts with ["cs", which I'm guessing means 'content start', and ends with ,["ce"] as shown below.
]
,0]
,["ce"]
So we can recover full messages with a regex as simple as 
(\["cs",.+\["ce"\]) 

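However, this is a greedy expression and may capture multiple messages in a single match. Because each record spans several lines, your regex engine also has to allow . to match newlines (dot-all or single-line mode), otherwise the expression will not match at all.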
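A minimal sketch of a non-greedy version of that carver, written in Python. The function name carve_records and the sample bytes are made up for illustration; only the ["cs" and ["ce"] markers come from the data above.

```python
import re

# Lazy match (.+?) stops at the first ["ce"]; DOTALL lets "." cross newlines.
# Compiled over bytes so it can run against raw disk or memory data.
RECORD = re.compile(rb'\["cs",.+?\["ce"\]', re.DOTALL)

def carve_records(blob):
    """Return (offset, record_bytes) for every cs...ce span found in blob."""
    return [(m.start(), m.group(0)) for m in RECORD.finditer(blob)]

# Two made-up records separated by junk bytes, for illustration only.
sample = (b'junk["cs","a1",1]\n,["ms","a1"]\n,["ce"]'
          b'\x00\x00["cs","b2",2]\n,["ce"]tail')
records = carve_records(sample)
for offset, record in records:
    print(offset, len(record))
```

On fragmented data a ["cs" with no nearby ["ce"] would still run on to the next terminator it finds, so a length cap on each match may be worth adding before trusting the results.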

Other fields of interest in the header include the message number internally assigned by Gmail, seen here as "140395ee6229f7d4", the message sender "win4n6@yahoogroups.com", and the subject "[win4n6] Digest Number 1388".
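As a rough sketch, those fields can be pulled out of a carved record with a few regular expressions. The function name summarize is hypothetical, and two things are assumptions drawn only from the sample above: that the summary fields all sit before the ["ms" marker, and that the subject is the last quoted string in that summary.

```python
import re

CS_ID = re.compile(r'\["cs","([0-9a-f]+)"')       # ID right after the marker
ADDRESS = re.compile(r'"([^"@\s]+@[^"@\s]+)"')    # quoted strings with an @
STRING = re.compile(r'"((?:[^"\\]|\\.)*)"')       # any quoted string

def summarize(record):
    """Pull the ID, addresses and subject from the summary part of a record."""
    head = record.partition('["ms",')[0]  # assumed: summary precedes ["ms"
    match = CS_ID.search(head)
    strings = STRING.findall(head)
    return {
        "id": match.group(1) if match else None,
        "addresses": sorted(set(ADDRESS.findall(head))),
        "subject": strings[-1] if strings else None,  # assumed: last string
    }

# Trimmed copy of the win4n6 summary shown above.
sample = ('["cs","140395ee6229f7d4","140395ee6229f7d4",1,,,1375366638336000,'
          '[["","win4n6@yahoogroups.com"]\n'
          ',["No Reply","notify-dg-win4n6@yahoogroups.com"]\n]\n'
          ',,,"Digest Number 1388","[win4n6] Digest Number 1388"]\n'
          ',["ms","140395ee6229f7d4",""]\n,["ce"]')
info = summarize(sample)
print(info)
```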

When the content of the message begins you will see ["ms" which again I can only assume is short for message start as seen below:

["ms","140395ee6229f7d4","",4,"win4n6@yahoogroups.com","","win4n6@yahoogroups.com",1375352053000,"There are 5 messages in this issue. Topics in this digest: 1a. Re: TightVNC F...",["^all","^i","^smartlabel_group","^unsub"]
,0,1,"[win4n6] Digest Number 1388",["140395ee6229f7d4",["win4n6@yahoogroups.com"]
,[]
,[]
If this is a mail folder summary view (which I've seen for pages preloaded after the first), this would be the end of the content cached and retrievable. If this is the first page of the mail folder, it will be followed by the text of the message itself:

,["No Reply \u003cnotify-dg-win4n6@yahoogroups.com\u003e"]
,"[win4n6] Digest Number 1388","There are 5 messages in this issue.\... Huge message digest here removed for readability\n",[[]
,[0]
,"",[]
]
,0,[[]
,[["win4n6","win4n6@yahoogroups.com"]
]
,[]
,[]
,[]
,[]
]
,"Thu, Aug 1, 2013 at 5:14 AM",[]
,1,0,0,0,1,"returns.groups.yahoo.com","yahoogroups.com","","\u003c1375352053.298.19336.m7@yahoogroups.com\u003e","[win4n6] Digest Number 1388","\u003cwin4n6.yahoogroups.com\u003e",,[0]
,,[]
,,0,[0]
,-1,,,[]
,[]
,0,0,1,0,0,,,[]
,,5314,-1]
,,0,"5:14 AM","5:14 am",0,,,"",["en"]
,0,"Thu, Aug 1, 2013 at 5:14 AM",[]
,,,,0,,"win4n6.yahoogroups.com",,0,1,"","win4n6@yahoogroups.com",[[]
,[["win4n6","win4n6@yahoogroups.com"]
]
,[]
,[]
,[]
,[]
]
,-1,,,,"yahoogroups.com",,[]
,[[[2013,7,31,5,37,,0,0]
,,"Wed Jul 31, 2013 5:37 am",0,0,0,0]
,[[2013,7,31,10,6,,0,0]
,,"Wed Jul 31, 2013 10:06 am",0,0,0,1]
,[[2013,7,31,8,28,,0,0]
,,"Wed Jul 31, 2013 8:28 am",0,0,0,3]
,[[2013,7,31,8,42,,0,0]
,,"Wed Jul 31, 2013 8:42 am",0,0,0,4]
,[[2013,7,31,8,50,,0,0]
,,"Wed Jul 31, 2013 8:50 am",0,0,0,6]
]
,0]
,["ce"]
You'll notice there is no matching message end (me) for the message start (ms), as we saw with the cs and ce pairing earlier. Instead the message ends with some index data about the messages in the thread related to this message so it can display them easily, and finishes with ["ce"] again.

For each message retrieved from Gmail you'll find these pairings. On Tuesday I'll dig into the JavaScript that interprets this data to see if we can find more data points for analysis. Until then, happy hunting for Gmail fragments, and I hope you stick around for tomorrow's Saturday Reading and Sunday Funday!
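The long integers in these records look like Unix epoch timestamps. Here is a small sketch for decoding them, assuming (from the digit counts only, not from any documentation) that the 16-digit value on the cs line is in microseconds and the 13-digit value on the ms line is in milliseconds. The helper names are made up.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def from_epoch_us(value):
    """Interpret value as microseconds since the Unix epoch (assumed unit)."""
    return EPOCH + timedelta(microseconds=value)

def from_epoch_ms(value):
    """Interpret value as milliseconds since the Unix epoch (assumed unit)."""
    return EPOCH + timedelta(milliseconds=value)

cs_time = from_epoch_us(1375366638336000)  # from the ["cs" line above
ms_time = from_epoch_ms(1375352053000)     # from the ["ms" line above
print(cs_time.isoformat())  # 2013-08-01T14:17:18.336000+00:00
print(ms_time.isoformat())  # 2013-08-01T10:14:13+00:00
```

The ms value decodes to 10:14 UTC, which lines up with the "Thu, Aug 1, 2013 at 5:14 AM" string in the same record if the mailbox was being displayed at UTC-5. That agreement is what makes the millisecond reading plausible.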

Friday, August 2, 2013

Daily Blog #39: Web 2.0 Forensic Part 4

Hello Reader,
      I finally got Fiddler installed (it's Windows only and available here: http://fiddler2.com/get-fiddler), and it is much improved over the last time I used it! It even has an AJAX and XML decoder built in now, which is a pretty huge improvement. In this post we are going to focus on what network data is actually being transmitted between the web client and the web 2.0 application, so you can see the raw data that your browser will be parsing and storing in memory/pagefile/hiberfil. Note that if you want to do this type of testing at home you will need an SSL proxy like Fiddler in order to capture the traffic; a network sniffer will just see encrypted traffic.

This is what the request for an inbox view looks like in gmail:
POST https://mail.google.com/mail/u/0/?ui=2&ik=21fc62e736&rid=mail%3Ai.7728.0.1&view=cv&th=1403b5ce42ebf543&th=140395ee6229f7d4&th=1403631f3703e936&th=140344ed98e4eaa3&th=140303866b4ce541&prf=1&_reqid=167197&nsc=1&mb=0&rt=j&search=inbox HTTP/1.1
Host: mail.google.com
Connection: keep-alive
Content-Length: 0
X-Same-Domain: 1
Origin: https://mail.google.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36
Content-Type: application/x-www-form-urlencoded;charset=UTF-8
Accept: */*
X-Chrome-Variations: CM21yQEIhLbJAQiptskBCIaEygEIt4XKAQ==
Referer: https://mail.google.com/_/mail-static/_/js/main/m_i,t,it/rt=h/ver=zDJLUK9Vw_8.en./sv=1/am=!Lt4ru3nDBdL0RMHSG0tdRQM1xOP0KmwcZtPFWYZIAZLMmkQ7GBAA95rDr4ZmlpWnYLsjYcfQ/d=1
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
 Cookie Removed
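
The query string of that request can be picked apart with Python's standard library. This is only a sketch for reading the captured URL; the observation that the repeated th values are the same identifiers that show up in the cs and ms records comes from comparing them with the response data, not from any Gmail documentation.

```python
from urllib.parse import urlsplit, parse_qs

url = ("https://mail.google.com/mail/u/0/?ui=2&ik=21fc62e736"
       "&rid=mail%3Ai.7728.0.1&view=cv&th=1403b5ce42ebf543"
       "&th=140395ee6229f7d4&th=1403631f3703e936&th=140344ed98e4eaa3"
       "&th=140303866b4ce541&prf=1&_reqid=167197&nsc=1&mb=0&rt=j"
       "&search=inbox")

# parse_qs collects repeated keys into lists and percent-decodes values.
params = parse_qs(urlsplit(url).query)
for thread_id in params["th"]:
    print(thread_id)
print(params["search"], params["rid"])
```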

This is the header of the response:
HTTP/1.1 200 OK
Content-Type: text/javascript; charset=UTF-8
Set-Cookie: Cookie Removed 
Domain=mail.google.com; Expires=Thu, 15-Aug-2013 23:39:57 GMT; Path=/mail; Secure; HttpOnly
P3P: CP="This is not a P3P policy! See http://support.google.com/accounts/bin/answer.py?answer=151657 for more info."
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Pragma: no-cache
Expires: Fri, 01 Jan 1990 00:00:00 GMT
Date: Thu, 01 Aug 2013 23:39:57 GMT
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
Content-Length: 46023
Server: GSE
This is the raw data that is transmitted containing the inbox mail data that you can recover and tools like IEF automatically recover for you:
)]}'
[[["v","zDJLUK9Vw_8.en.","8","dd1cc0830f5f7b2d"]
,["di",710,,,,,[]
,[]
,,,[]
,[]
,[]
]
,["cs","1403b5ce42ebf543","1403b5ce42ebf543",1,,,1375387947676000,"1403b5ce42ebf543",["1403b5ce42ebf543"]
,[]
,[]
,[["1403b5ce42ebf543",["^all","^i","^smartlabel_group","^unsub"]
]
]
,,,[]
,[["","examplegooglegroup@googlegroups.com"]
]
,,,[]
,[]
,,,"Abridged summary of examplegooglegroup@googlegroups.com - 1 Message in 1 Topic","[DFIR] Abridged summary of examplegooglegroup@googlegroups.com - 1 Message in 1 Topic"]
,["ms","1403b5ce42ebf543","",4,"examplegooglegroup@googlegroups.com","","examplegooglegroup@googlegroups.com",1375385478000,"Today's Topic Summary Group: http://groups.google.com/group/examplegooglegroup...",["^all","^i","^smartlabel_group","^unsub"]
,0,1,"[DFIR] Abridged summary of examplegooglegroup@googlegroups.com - 1 Message in 1 Topic",["1403b5ce42ebf543",["Abridged Recipients \u003cexamplegooglegroup@googlegroups.com\u003e"]
,[]
,[]
,["examplegooglegroup@googlegroups.com"]
,"The complete message was located here ",[[]
,[0]
,"",[]
]
,0,[[]
,[["Abridged","examplegooglegroup@googlegroups.com"]
]
After this, each inbox entry and message preview is listed in sequence, and the response ends with:
]
,-1,,,,"google.com",,[]
,[]
,0,""]
,["ce"]
,["e",18,,,45978]
],'fce167f9fb9f05f']
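
The payload above is close to JSON but will not load in a strict JSON parser as-is: it opens with a )]}' line, leaves array slots empty (the ,,, runs), and ends with a single-quoted string. A minimal sketch for one carved record follows. It fills the empty slots with null so json.loads can read the record; fill_sparse is a made-up helper, and it only handles elided slots, not the prefix line or the single-quoted trailer.

```python
import json

def fill_sparse(text):
    """Insert explicit nulls where an array literal leaves a slot empty."""
    out = []
    in_string = False
    escaped = False
    prev = None  # last non-whitespace character seen outside a string
    for ch in text:
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch == "," and prev in ("[", ","):
            out.append("null")  # a comma right after "[" or "," is a gap
        if not ch.isspace():
            prev = ch
        out.append(ch)
    return "".join(out)

# Shortened stand-in for a carved ["cs" summary.
sample = ('["cs","140395ee6229f7d4",1,,,1375366638336000,[]\n'
          ',,,"Digest Number 1388"]')
record = json.loads(fill_sparse(sample))
print(len(record), record[-1])
```

Once a record is a real list, fields can be addressed by position, though the positions themselves are only known from observed samples and may shift when the format changes.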

Tomorrow let's talk about what's contained in these fields and what a good regular expression to recover the data looks like; heck, maybe even a foremost rule to automate the recovery for you. Or you can do what I do and get a copy of IEF rather than try to keep up with all the changes that are made to their data formats.

Thursday, August 1, 2013

Daily Blog #38: Web 2.0 Forensics Part 3

Hello Reader,
        This post is a bit late in the day but that happens sometimes when you are onsite and can't sneak away for some blog writing. In the last two posts we've discussed where to find JSON/AJAX fragments and how Gmail stores message data within them. Today we will discuss how these artifacts are created and what you can and cannot recover from them.

What you can recover
Much like other web artifacts we can only recover what was sent by the server and viewed by the custodian. This includes:

  • the content of emails read
  • the names and contents of attachments accessed
  • what was contained on each mailbox folder viewed (such as the inbox, sent, saved)
    • For some webmail clients (such as Gmail) you can also see a preview of the email messages contained in the mailbox even if they did not read them, as the data is precached.
    • Whether the message had been read
    • If the message had an attachment
  • a list of all the mailbox folders the custodian had in use
  • contacts
  • for Gmail specifically, Google Talk participants
  • for Gmail specifically, a list of all the circles they are in.


What you can't recover
If the data was never sent from the server and viewed it won't be in cached form anywhere except live memory. The list of things you can't recover includes:


  • The text of emails sent from the custodian unless they viewed a preview of the message, checked their sent mail or read a reply to the message. 
  • The content of attachments sent via email, though you can match up the file by name to files on their system, as the attachment-success message will be sent from the server to the browser.
  • The full contents of mail folders if all the pages containing messages were not viewed
  • The contents of all webmail ever read. Over time the data in the pagefile will be overwritten, the shadow copies will expire, and the hiberfil will be overwritten on the next hibernation.

The examples I'm showing here are for webmail, but there are other popular AJAX/JSON services out there (Facebook, Twitter, etc.). I'm focusing on webmail because in my line of work it's a popular method for exfiltrating data and for discussing plans that custodians don't want saved in company email. I will see about expanding the series to other types of web 2.0 applications, likely after my HTML5 offline caching research with Blazer Catzen is complete.

Tomorrow we continue the web 2.0 forensic series, hopefully with an earlier posting time.

Tuesday, July 30, 2013

Daily Blog #37: Web 2.0 Forensics Part 2

Hello Reader,
             Sunday Funday is always fun for me for two reasons. One it gets me two blog posts out of one so I get more time to get work done and two I like getting a general feeling of what level of understanding exists on certain artifacts. So while you get a prize, that I strive to make worth your effort, I get to see what I can continue to help you learn by writing additional blog posts to fill those gaps. With that said we are continuing the web 2.0 series today that I realized was needed from the IEF Sunday Funday challenge two weeks ago.

JSON Data Structures

JSON data structures are fairly easy to find. They are structured name/value pairs exchanged between the web server and the web client, for instance the Gmail server and the Chrome browser. (Gmail's payloads, as the samples below show, are nested arrays of positional values rather than named pairs.) In this example the Chrome browser then parses the data to generate the view that you see.

Here is what a message summary from your Gmail inbox looks like:

Index data for gmail
["140303866b4ce541","140303866b4ce541","140303866b4ce541",1,0,["^all","^i","^o","^smartlabel_notification"]
,[]

Email from/subject/message preview and date
,"\u003cspan class\u003d\"yP\" email\u003d\"mail-noreply@google.com\" name\u003d\"Gmail Team\"\u003eGmail Team\u003c/span\u003e","\u0026raquo;\u0026nbsp;","Welcome to the new Gmail inbox","Hi David Meet the new inbox Inbox tabs put you back in control with simple organization so that you",0,"","","10:35 am","Tue, Jul 30, 2013 at 10:35 AM",1375198584460000,,[]
,,0,[]
,,[]
,,"3",[0]
,,"mail-noreply@google.com",,,,0,0]
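
The sender field in that summary is HTML with its angle brackets, equals signs and quotes escaped. A short sketch of decoding one such field: json.loads handles the \u003c style escapes, and a regex then reads the email and name attributes. Treating those two attributes as always present and in that order is an assumption based on this one sample.

```python
import json
import re

# The first quoted field from the summary above, copied verbatim.
raw = (r'"\u003cspan class\u003d\"yP\" email\u003d\"mail-noreply@google.com\"'
       r' name\u003d\"Gmail Team\"\u003eGmail Team\u003c/span\u003e"')

html = json.loads(raw)  # decodes \u003c, \u003d, \u003e and \" escapes
match = re.search(r'email="([^"]*)" name="([^"]*)"', html)
sender_email, sender_name = match.groups()
print(html)
print(sender_email, sender_name)
```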

Here is what a fully loaded message and its email header look like:

Gmail Team <mail-noreply@google.com>
10:35 AM (36 minutes ago)
to me

This is followed by the body of the message. In addition, each page carries a listing of all the labels, email counts, circles and more that is preloaded with it. This gives you a large amount of data on your custodian's activities, but also a large number of duplicates.

Tomorrow we will go into the important fields and their meanings, and I'll provide a regex for carving them out. Recovering webmail used to be simple: just find a JavaScript library known to the service and carve out the HTML before and after it. Now, with JSON/AJAX services like Gmail, we get fragments of emails and possibly entire messages, but we either have to manually carve them or use a tool like IEF to do it for us.

I start with IEF and let it find the fully formed messages, and then go back myself to find partials, knowing the user's email address.
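
The manual pass for partials can be sketched as a search for the known address with some surrounding bytes kept for review. The function name find_fragments, the context size and the example address are all hypothetical, and the sketch assumes the address is stored as plain ASCII in the data being searched.

```python
def find_fragments(data, address, context=200):
    """Return (offset, surrounding_bytes) for each occurrence of address."""
    needle = address.encode("ascii")
    hits = []
    start = 0
    while True:
        pos = data.find(needle, start)
        if pos == -1:
            break
        lo = max(0, pos - context)
        hi = min(len(data), pos + len(needle) + context)
        hits.append((pos, data[lo:hi]))
        start = pos + 1
    return hits

# Made-up buffer standing in for a chunk of pagefile or unallocated space.
data = b"\x00" * 10 + b'["","user@example.com"]' + b"\x00" * 10
hits = find_fragments(data, "user@example.com", context=4)
for offset, fragment in hits:
    print(offset, fragment)
```

For a real pagefile the data would need to be read in overlapping chunks rather than loaded whole, and each hit still has to be reviewed by eye to decide whether it sits inside a message record.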

See you tomorrow! Leave comments or questions below if you're seeing data differently. I'm going to install Fiddler on my system tonight to show how the data looks as it's being transmitted.