How to Fetch Data From Instagram Using Python

Instagram is one of the most popular social media sites, with well over a billion users. Everyone from students to celebrities has an Instagram account. Public data from Instagram can be of immense value to businesses, marketers, and individuals, who can use it for data analysis, targeted marketing, and generating insights.


You can use Python to build an automated tool that extracts Instagram data.


Installing Required Libraries

Instaloader is a Python library you can use to extract publicly available data from Instagram. With it, you can access data like images, videos, usernames, post counts, follower counts, following counts, and bios. Note that Instaloader is not affiliated with, authorized, maintained, or endorsed by Instagram in any way.

To install instaloader via pip, run the following command:

pip install instaloader

You must have pip installed on your system to install external Python libraries.

Next, you need to install the Pandas Python library. Pandas is a Python library that’s mainly used to perform data manipulation and data analysis. Run the following command to install it:

pip install pandas
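Before moving on, it can help to confirm that both installs succeeded. The snippet below is a minimal sketch using only the standard library; the helper name is_installed is a hypothetical name chosen for illustration and is not part of Instaloader or Pandas.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if Python's import system can locate the module."""
    # find_spec returns None when a top-level module cannot be found.
    return importlib.util.find_spec(module_name) is not None


# Report the status of the two libraries this article relies on.
for name in ("instaloader", "pandas"):
    if is_installed(name):
        print(f"{name}: installed")
    else:
        print(f"{name}: missing (try: pip install {name})")
```

If either library is reported as missing, rerun the matching pip command in the same Python environment you plan to run the script in.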

Now, you’re ready to set up the code and start fetching data from Instagram.

Setting Up Your Code

To set up the Instagram data fetching tool, you need to import the Instaloader Python library and create an instance of the Instaloader class. After that, you need to provide the Instagram handle of the profile from which you want to extract the data.

The Instagram Extractor Python code is available in a GitHub repository and is free for you to use under the MIT License.

import instaloader


bot = instaloader.Instaloader()


profile = instaloader.Profile.from_username(bot.context, 'cristiano')
print(profile)

This is a good first step to check that the basics work. If everything is set up correctly, the script prints the profile object without raising any errors.

You can extract valuable publicly available data like the username, number of posts, followers count, following count, bio, user ID, and external URL using Instaloader with just a few lines of code. You only need to provide the Instagram handle of the profile.

import instaloader


bot = instaloader.Instaloader()


profile = instaloader.Profile.from_username(bot.context, 'leomessi')
print("Username: ", profile.username)
print("User ID: ", profile.userid)
print("Number of Posts: ", profile.mediacount)
print("Followers Count: ", profile.followers)
print("Following Count: ", profile.followees)
print("Bio: ", profile.biography)
print("External URL: ", profile.external_url)

The script prints the profile details for the handle you specify.
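If you want to store these values rather than print them, one option is to copy them into a dictionary. The sketch below assumes a profile-like object that exposes the same attribute names used in the snippet above; the helper profile_to_dict and the sample values are hypothetical, and a stand-in object replaces a real Instaloader profile so the example runs without network access.

```python
from types import SimpleNamespace

# Attribute names taken from the profile snippet in this article.
PROFILE_FIELDS = (
    "username", "userid", "mediacount", "followers",
    "followees", "biography", "external_url",
)


def profile_to_dict(profile) -> dict:
    """Copy the selected attributes of a profile-like object into a dict."""
    return {field: getattr(profile, field) for field in PROFILE_FIELDS}


# Stand-in object with made-up values (not real Instagram data).
sample = SimpleNamespace(
    username="example_user", userid=12345, mediacount=10,
    followers=200, followees=150, biography="Hello!", external_url=None,
)

record = profile_to_dict(sample)
print(record["username"], record["followers"])
```

With a real profile, you would pass the object returned by instaloader.Profile.from_username() instead of the stand-in.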

You can extract email addresses from the Instagram bio of any profile using regular expressions. You need to import Python’s re library and pass a regular expression that matches email addresses, along with the bio text, to the re.findall() method:

import instaloader
import re
bot = instaloader.Instaloader()
profile = instaloader.Profile.from_username(bot.context, "wealth")
print("Username: ", profile.username)
print("Bio: ", profile.biography)
emails = re.findall(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b", profile.biography)
print("Emails extracted from the bio:")
print(emails)

The script prints a list of everything in the bio that it recognizes as an email address.
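To try the matching step without contacting Instagram, you can run the same idea against a plain string. This is a minimal sketch: the bio text and addresses are made up, the helper name extract_emails is hypothetical, and the pattern is a common simplified email expression rather than a full validator.

```python
import re

# Simplified email pattern: local part, "@", domain, dot, top-level domain.
EMAIL_PATTERN = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")


def extract_emails(text: str) -> list:
    """Return every substring of text that looks like an email address."""
    return EMAIL_PATTERN.findall(text)


# Made-up bio text used only for illustration.
sample_bio = "Founder of Example Co. | Business: hello@example.com | press@example.org"
print(extract_emails(sample_bio))
```

A simplified pattern like this can miss unusual but valid addresses, so treat the results as candidates rather than verified contacts.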

When you search for anything on Instagram, you get several results including usernames and hashtags. You can extract the top search results using the get_profiles() and get_hashtags() methods. You only need to pass the search query to instaloader.TopSearchResults(). You can then iterate over the individual results to print or store them.

import instaloader


bot = instaloader.Instaloader()


search_results = instaloader.TopSearchResults(bot.context, 'music')


for username in search_results.get_profiles():
    print(username)


for hashtag in search_results.get_hashtags():
    print(hashtag)

The output includes any matching profiles and hashtags.

You can extract the followers of an account, and those that it follows itself, using Instaloader. You’ll need to provide an Instagram username and password to retrieve this data.

Never use your personal accounts to extract data from Instagram as it may get your account temporarily or permanently banned.

After creating an instance of the Instaloader class, you need to provide your username and password. This is so that the bot can log in to Instagram using your account and fetch the followers and followings data.
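Rather than typing the password directly into the script, you might read the login details from environment variables. The sketch below is one possible approach; the variable names IG_USERNAME and IG_PASSWORD and the helper get_credentials are assumptions made for illustration and are not defined by Instaloader.

```python
import os


def get_credentials(env=os.environ):
    """Read login details from environment variables (hypothetical names)."""
    username = env.get("IG_USERNAME")
    password = env.get("IG_PASSWORD")
    if not username or not password:
        raise RuntimeError("Set IG_USERNAME and IG_PASSWORD before running the script.")
    return username, password


# Intended usage with the login call shown in this article:
# user, passwd = get_credentials()
# bot.login(user=user, passwd=passwd)
```

Keeping credentials out of the source file makes it less likely that they end up in a shared repository by accident.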

Next, you need to provide the Instagram handle of the target profile. The get_followers() and get_followees() methods extract the followers and followees. You can get the followers’ and followees’ usernames using the follower.username and followee.username properties respectively.

If you want to store the results in a CSV file, you first need to convert the data into a Pandas DataFrame object. Use pd.DataFrame() to convert a list object into a DataFrame.

Finally, you can export the DataFrame object to a CSV file using the to_csv() method. Pass the output filename, such as followers.csv, as a parameter to this method to get the exported data in the CSV file format.

Only the account owners can see all the followers and followings. You will not be able to extract all the followers and followings data using this or any other method.


import instaloader
import pandas as pd


bot = instaloader.Instaloader()
bot.login(user="Your_username", passwd="Your_password")


profile = instaloader.Profile.from_username(bot.context, 'Your_target_account_insta_handle')


followers = [follower.username for follower in profile.get_followers()]


followers_df = pd.DataFrame(followers)


followers_df.to_csv('followers.csv', index=False)


followings = [followee.username for followee in profile.get_followees()]


followings_df = pd.DataFrame(followings)


followings_df.to_csv('followings.csv', index=False)
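As a variation on the export step, the standard library csv module can write the same list of usernames without Pandas. This is a sketch of an alternative, not the method used above: the helper write_usernames and the sample names are hypothetical, and an in-memory buffer stands in for a real file so the example runs anywhere.

```python
import csv
import io


def write_usernames(usernames, file_obj):
    """Write a header row followed by one username per row."""
    writer = csv.writer(file_obj)
    writer.writerow(["username"])
    for name in usernames:
        writer.writerow([name])


# Made-up usernames written to an in-memory buffer instead of a file.
buffer = io.StringIO()
write_usernames(["alice", "bob"], buffer)
print(buffer.getvalue())
```

To write to disk instead, open the target with open("followers.csv", "w", newline="") and pass that file object to the helper.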

Download Posts From an Instagram Account

Again, to download posts from any account, you’ll need to provide a username and password. This is so the bot can log in to Instagram using your account. You can retrieve all the posts’ data using the get_posts() method, and then iterate over the results and download each post using the download_post() method.


import instaloader


bot = instaloader.Instaloader()
bot.login(user="Your_username", passwd="Your_password")


profile = instaloader.Profile.from_username(bot.context, 'Your_target_account_insta_handle')


posts = profile.get_posts()


for index, post in enumerate(posts, 1):
    bot.download_post(post, target=f"{profile.username}_{index}")
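An account can have a very large number of posts, so you may prefer to cap how many the loop processes. The sketch below uses itertools.islice from the standard library to take only the first few items of any iterable; the helper name first_n is hypothetical, and a range of numbers stands in for real posts so the example runs offline.

```python
from itertools import islice


def first_n(items, limit):
    """Yield at most `limit` items from any iterable, in order."""
    return islice(items, limit)


# Stand-in for an iterator of posts: take the first 5 of 100 items.
demo = list(first_n(iter(range(100)), 5))
print(demo)

# Intended usage with the download loop shown in this article:
# for index, post in enumerate(first_n(profile.get_posts(), 10), 1):
#     bot.download_post(post, target=f"{profile.username}_{index}")
```

Because islice stops consuming the iterator once the limit is reached, the loop does not need to walk through the remaining posts.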

Scrape the Web Using Python

Data scraping or web scraping is one of the most common ways to extract useful information from the web. You can use the data you extract for marketing, content creation, or decision-making.

Python is one of the most popular languages for data scraping. Libraries like BeautifulSoup and Scrapy simplify data extraction, while Pandas helps with analyzing the results.
