Wednesday, February 1, 2012

How I dumped the profile pics of the first 10,000 Facebook users within a few hours

A clip from the movie "The Social Network"

Hi all. In this article I am going to tell you how I downloaded the profile/cover pictures of the first ten thousand Facebook users within a few hours, using a Python script of about 100 lines. I used the Facebook Graph API and an HTML comment present in the Facebook profile page (you will get to know more about this later on).

So what is the Facebook Graph API?

Using the Facebook Graph API you can retrieve a few pieces of profile information about a Facebook user, such as the profile ID, first name, last name, Facebook username, gender and locale.
To get this information, the only thing you have to do is access the following URL: <target profile id>

Just replace the id parameter with your own. One important thing is that the API returns false if the ID is not valid. For example, if you try to access id=1, the API will return false because that is not a valid Facebook ID. But if you change the parameter to 4, you can see that the API returns the above-mentioned information for Mark Zuckerberg. I am going to use the Graph API to check whether a target profile ID is valid or not. You might wonder why I used this API. It is true that the same thing can be done by accessing the profiles for IDs 1, 2, 3 and so on directly. My answer is that the API is lightweight, so you don’t have to craft each and every HTTP header just to check for a valid profile ID.
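The validity check described above can be sketched as a small parsing function. This is only an illustration: the function name `parse_graph_response` and the sample response bodies are made up, and the only behaviour taken from the article is that an invalid ID yields false while a valid ID yields a set of profile fields.

```python
import json
from typing import Optional


def parse_graph_response(body: str) -> Optional[dict]:
    """Return the profile fields as a dict, or None if the ID is not valid."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return None  # not JSON at all, so treat the ID as invalid
    # The article says an invalid ID yields the literal value false;
    # anything that is not a JSON object is treated the same way here.
    if not isinstance(data, dict):
        return None
    return data


# Hypothetical response bodies, made up for illustration.
print(parse_graph_response("false"))
print(parse_graph_response('{"id": "4", "name": "Example User"}'))
```

Keeping the parsing separate from the HTTP request makes the check easy to try out without touching the network.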

Another feature of the Graph API is getting the like and share counts of any link. The Graph API returns, as JSON, the number of times a link has been shared or liked on Facebook. You can do it in this way

Another thing you can do with the Facebook Graph API is block detection. If a user tries to access an invalid profile, the application takes the user to a page like “The page you requested was not found.” If the user has been blocked by someone, the application does the same for that person's profile. Because the Graph API still reports whether the ID itself is valid, a user can compare the two results and easily work out whether they have been blocked by someone or not.
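The comparison behind block detection might look like the sketch below. The names `id_is_valid` and `classify_profile` are hypothetical, the inputs are plain strings rather than live responses, and the result is labelled "possibly blocked" because the article does not rule out other reasons for a profile page being unavailable.

```python
import json

# Text of the error page quoted in the article.
NOT_FOUND_TEXT = "The page you requested was not found."


def id_is_valid(graph_body: str) -> bool:
    """True if the Graph API body is a JSON object rather than false."""
    try:
        return isinstance(json.loads(graph_body), dict)
    except json.JSONDecodeError:
        return False


def classify_profile(graph_body: str, profile_html: str) -> str:
    """Compare the Graph API answer with what the profile page shows."""
    if NOT_FOUND_TEXT not in profile_html:
        return "visible"
    # The page is missing: a valid ID suggests a block, an invalid one does not.
    return "possibly blocked" if id_is_valid(graph_body) else "invalid id"


# Made-up inputs for illustration.
print(classify_profile('{"id": "123"}', "<p>" + NOT_FOUND_TEXT + "</p>"))
print(classify_profile("false", "<p>" + NOT_FOUND_TEXT + "</p>"))
```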

An interesting html comment:

If you look at the source of any Facebook user's profile page while you are logged in, you will find that Facebook returns the actual image location of the profile/cover pictures inside an HTML comment.

For example, Mark Zuckerberg’s Facebook profile is We can find the image location of his current cover picture by inspecting it, which is

Looking at the source code of his profile page, I found that the application discloses this image path in an HTML comment like:

<!-- <div class="fbTimelineTopSectionBase"><div id="pagelet_above_header_timeline" data-referrer="pagelet_above_header_timeline"></div><div id="above_header_timeline_placeholder"></div><div class="fbTimelineSection mtm fbTimelineTopSection"><div id="fbProfileCover"><div class="cover" style="margin-top: -115px;" data-collapse="115"><a class="coverWrap coverImage" href=";set=a.941146602501.2418915.4&amp;type=1" rel="theater" id="fbCoverImageContainer"><img class="photo img" src="" style="top:0px;width:100%;" data-fbid="989690200741" alt="Cover Photo" /> -referrer="pagelet_timeline_nav"></div><div id="pagelet_above_header_not_timeline" data-referrer="pagelet_above_header_not_timeline"></div></div></div><div id="timeline_tab_content"><div id="pagelet_escape_hatch" data-referrer="pagelet_escape_hatch"></div><div id="pagelet_timeline_recent" data-referrer="pagelet_timeline_recent"></div></div> -->

One important thing about this is that the application does not return the HTML comment if you are not logged in to Facebook.

So using this HTML comment, it becomes much easier to mass download Facebook users' profile/cover pictures.
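Pulling the image location out of such a comment can be sketched with a regular expression. This is an illustration rather than the pattern the original script used: the function name `extract_cover_url` is hypothetical, the pattern simply keys on the `class="photo img"` attribute visible in the comment above, and the sample markup uses a placeholder address because the real image path is not shown.

```python
import html
import re
from typing import Optional

# Matches the <img class="photo img" src="..."> tag seen in the comment.
COVER_IMG_RE = re.compile(r'<img\s+class="photo img"\s+src="([^"]+)"')


def extract_cover_url(page_source: str) -> Optional[str]:
    """Return the first cover image URL found in the page source, if any."""
    match = COVER_IMG_RE.search(page_source)
    if match is None:
        return None
    # Attribute values are HTML-escaped (&amp; and so on), so unescape them.
    return html.unescape(match.group(1))


# Shortened, made-up sample with a placeholder URL.
sample = (
    '<!-- <div id="fbProfileCover"><a class="coverWrap coverImage" '
    'id="fbCoverImageContainer"><img class="photo img" '
    'src="https://example.invalid/cover.jpg?a=1&amp;b=2" '
    'alt="Cover Photo" /></a></div> -->'
)
print(extract_cover_url(sample))
```

A pattern this strict breaks as soon as the markup changes, which is a known weakness of regex-based extraction.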

My strategy to achieve the target was:

1)    Choose any random profile ID.
2)    Using the Graph API, verify whether the ID is valid or not. If the ID is not valid, the server will return “false”. If the ID is valid, the server will return some information such as name, profile ID, gender, locale etc.

3)    If the ID is valid, I send an HTTP request to the profile page for that ID with all the necessary HTTP headers. 
For example:

The server will then redirect us to the actual profile location, which I read from the Location HTTP header in the server response.

4)    Now I craft another HTTP request with a valid session cookie and the other mandatory HTTP headers, and request the profile page of the target user. The server then returns the client-side code of that user's profile page.

5)    Since the Facebook application returns the actual image location of the profile picture or cover picture through an HTML comment, the image URL can easily be extracted from the grabbed client-side code using a simple regular expression.

6)    After getting the image URL, it is very easy to download the picture. 

I have written this Python script to automate the above-mentioned process.
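The script itself is not reproduced here. As a rough sketch only, and not the author's code, the control flow of steps 2 to 5 could be arranged as below. Every name in it (`find_picture_url` and the three callables passed in) is hypothetical, the network requests are left to the caller, and the regular expression is an assumption based on the HTML comment quoted earlier.

```python
import re
from typing import Callable, Optional

# Assumed pattern, based on the <img class="photo img"> tag in the comment.
IMG_RE = re.compile(r'<img\s+class="photo img"\s+src="([^"]+)"')


def find_picture_url(
    profile_id: int,
    id_is_valid: Callable[[int], bool],
    resolve_profile_location: Callable[[int], Optional[str]],
    fetch_page_source: Callable[[str], str],
) -> Optional[str]:
    """Run steps 2 to 5 for one profile ID and return the image URL, if any."""
    # Step 2: skip IDs that the Graph API reports as invalid.
    if not id_is_valid(profile_id):
        return None
    # Step 3: the redirect's Location header gives the real profile address.
    location = resolve_profile_location(profile_id)
    if location is None:
        return None
    # Step 4: fetch the profile page source (logged-in session assumed).
    source = fetch_page_source(location)
    # Step 5: extract the image URL from the HTML comment.
    match = IMG_RE.search(source)
    return match.group(1) if match else None


# Stand-in callables with made-up data, so the flow can run offline.
fake_page = '<!-- <img class="photo img" src="https://example.invalid/p.jpg" /> -->'
url = find_picture_url(
    4,
    id_is_valid=lambda pid: pid == 4,
    resolve_profile_location=lambda pid: "https://example.invalid/profile",
    fetch_page_source=lambda location: fake_page,
)
print(url)
```

Passing the three request helpers in as arguments keeps the flow separate from any particular HTTP library. Step 6, the download itself, is left out.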

Abusing the Graph API may not be a very big deal, but I informed Facebook about this HTML comment in the profile page and shared the exploit code with them. According to them, cover/profile pictures must be public, so the HTML comment does not have any direct impact on the Facebook application. Interestingly, after I got their reply mail, the above-mentioned script stopped working as before. Most probably they have implemented some anti-automation technique, or something similar, to prevent this.