Mining the Social Web: Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites

Want to tap the tremendous amount of valuable social data in Facebook, Twitter, LinkedIn, and Google+? This refreshed edition helps you discover who’s making connections with social media, what they’re talking about, and where they’re located. You’ll learn how to combine social web data, analysis techniques, and visualization to find what you’ve been looking for in the social haystack—as well as useful information you didn’t know existed.

Each standalone chapter introduces techniques for mining data in different areas of the social Web, including blogs and email. All you need to get started is a programming background and a willingness to learn basic Python tools.

  • Get a straightforward synopsis of the social web landscape
  • Use adaptable scripts on GitHub to harvest data from social network APIs such as Twitter, Facebook, LinkedIn, and Google+
  • Learn how to employ easy-to-use Python tools to slice and dice the data you collect
  • Explore social connections in microformats with the XHTML Friends Network
  • Apply advanced mining techniques such as TF-IDF, cosine similarity, collocation analysis, document summarization, and clique detection
  • Build interactive visualizations with web technologies based upon HTML5 and JavaScript toolkits
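
For readers unfamiliar with two of the techniques named above, here is a minimal sketch of TF-IDF weighting and cosine similarity using only the Python standard library. It is not taken from the book: the function names `tf_idf_vectors` and `cosine_similarity`, the whitespace tokenizer, the unsmoothed `log(N / df)` weighting, and the three sample documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one {term: weight} dict per document (hypothetical helper)."""
    # Naive tokenizer (an assumption): lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        # Weight = term frequency * log(N / document frequency).
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up sample documents, not data from the book.
docs = [
    "mining the social web",
    "mining twitter data with python",
    "cooking pasta at home",
]
vecs = tf_idf_vectors(docs)
print(cosine_similarity(vecs[0], vecs[1]))  # documents share the term "mining"
print(cosine_similarity(vecs[0], vecs[2]))  # documents share no terms
```

Under these assumptions, documents that share rarer terms score closer to 1, and documents with no terms in common score 0.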

“A rich, compact, useful, practical introduction to a galaxy of tools, techniques, and theories for exploring structured and unstructured data.”
–Alex Martelli, Senior Staff Engineer, Google

Product Features

  • ISBN13: 9781449388348
  • Condition: New
  • Notes: BRAND NEW FROM PUBLISHER! 100% Satisfaction Guarantee. Tracking provided on most orders. Buy with Confidence! Millions of books sold!


3 Responses to Mining the Social Web: Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites

  1. H. Smith "profhal"

    Pure fun

    Mining the Social Web does a great job of introducing a wide variety of techniques and a wealth of resources for exploring freely available social data and personal information. If you are willing to spend the time tinkering with the examples, the book is pure fun. It offers a nice complement to Segaran’s book. The two books overlap, but where they do, they offer different perspectives and explanations of common techniques (e.g., TF-IDF, cosine similarity, Jaccard index). If you are well-versed in data mining the web, you may find much of the discussion familiar. If you have only been casually engaged to date, your toolbox will fill quickly.

    In order to work with the book’s examples related to LinkedIn and Facebook, you really need to have a robust collection of connections. In terms of the source code itself, most of it worked as is. I wasn’t able to install the Buzz library, which limited my interaction with the material in chapter 7, and I opted not to get involved with the LinkedIn or Facebook examples, but I found the discussions around them easy to follow. By far my favorite chapter in the book was chapter 8, “Blogs et al.: Natural Language Processing (and Beyond)…” It was quite fascinating and caused my reading list to grow considerably.

  2. Ricardo Bánffy

    A book that covers an awesome lot of ground

    This book covers a lot of ground. It is, at times, a bit vertiginous in the number of subjects and technologies it touches per chapter, and it is not always easy to follow. It can also introduce so many interesting things that, by the time you have become familiar with all of them, after wandering for hours on the web and jumping from interesting technology to interesting technology, you may have forgotten what took you to these places and wonder where you were in the book. Time spent reading it is, however, time very well spent. When you finish it, you will have at least a cursory familiarity with tools like OAuth, CouchDB, Redis, MapReduce, NumPy (and the Python programming language, although it will help you a lot if you know your way around Python before you start the book), Graphviz, SIMILE widgets, NLTK, various service APIs and data formats, and you will be well equipped to explore those rich datasets on your own. The chapters are well compartmentalized, and it’s easy to pick chapters to read according to your needs. I know that, when I face the problems they tackle, I will do exactly that.

    If you do any kind of analysis and visualization of social-generated data that’s on the web, this book is a good pick. Even if your datasets are not from the web, you may find the parts on analysis and visualization very interesting.

  3. Easy to read. I tore through it

    Some basic programming ability is a must for this book, as the first page starts with installing the Python development tools. If you don’t know Python, that is okay, since all the code is easy to follow. Everything you need to develop and run the examples is described step by step, with clear instructions at every point.

    Once you get comfortable with the basics, the author quickly moves from topic to topic, giving a good introduction to many aspects of how to mine data and generate useful conclusions. Some of the examples include:

      • accessing your Twitter feed with OAuth
      • processing feeds to determine influence
      • using set-wise operations with Redis to determine which of your friends are also followers
      • storing data in CouchDB
      • using map-reduce to determine the most popular mentions and topics
      • natural language processing
      • seeing data with various visualization tools

    And that was just for Twitter.

    The book continues with examples of processing mailboxes, LinkedIn, Google Buzz, blogs, Facebook, and the Semantic Web. The examples show how easy it is to gather and analyze data from all these social web sites.

    With a good breadth of coverage, I highly recommend this book for anyone wanting to learn to process and visualize large amounts of data, either from the social web or any other data source.
