When working with social media (or other data collections with limited and conditional access), the provider's own Terms of Service (ToS) must be observed in addition to other legal concerns and frameworks.
Social media providers are privately owned companies that set their own rules for using the service, including the extent to which data may be extracted and how the data may be used.
The general Terms of Service that apply to users must be observed by anyone using a social media service. These rules typically determine ownership of content and codes of conduct. Failing to observe them can result in the user's account being temporarily or permanently banned and/or deleted.
Some social media services offer extended access via a researcher or developer account. Such extended access may allow for data harvesting, but only on the conditions set out in the Terms of Service for developers (or researchers). These terms serve as an agreement between the researcher and the service in question, so observing and adhering to them is crucial.
Twitter is one example that illustrates the importance of reading the Terms of Service closely:
With a developer account, a researcher may harvest tweets on a given subject. However, the resulting data collection may not be shared or made public "as is". Only the Tweet IDs may be shared.
Therefore, whether one is using an existing dataset or publishing one, only the Tweet IDs may be made available in the dataset.
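As an illustration of what reducing a collection to Tweet IDs can look like in practice, the following is a minimal sketch and not a prescribed method. It assumes the harvested tweets are stored as JSON Lines (one tweet object per line) and that each object carries its ID under the key "id_str" or "id". The function names and file paths are hypothetical and must be adapted to the actual harvest format.

```python
import json
from typing import Iterable, Iterator


def extract_tweet_ids(jsonl_lines: Iterable[str]) -> Iterator[str]:
    """Yield the ID of each tweet found in an iterable of JSON Lines strings."""
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        tweet = json.loads(line)
        # Assumption: the ID is stored under "id_str" or, failing that, "id".
        tweet_id = tweet.get("id_str") or tweet.get("id")
        if tweet_id is not None:
            yield str(tweet_id)


def write_id_file(jsonl_path: str, ids_path: str) -> int:
    """Write one Tweet ID per line to ids_path and return how many were written."""
    count = 0
    with open(jsonl_path, encoding="utf-8") as src, \
            open(ids_path, "w", encoding="utf-8") as dst:
        for tweet_id in extract_tweet_ids(src):
            dst.write(tweet_id + "\n")
            count += 1
    return count
```

A call such as write_id_file("harvest.jsonl", "tweet_ids.txt") would then produce a file containing only the IDs, with the tweet text, user information and other metadata left out of the shared dataset.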
When using an existing dataset, researchers may use the application Hydrator to (re)obtain the content of the tweets from the Tweet IDs. Limited use of "rehydrated" tweets is permitted in research publications, so tweets may be quoted to a limited extent.
Twitter Terms of Service (user): https://twitter.com/en/tos
Twitter Terms of Service (developer): https://developer.twitter.com/en/developer-terms/agreement
Terms of Service (developer) subpage on Tweet IDs: https://developer.twitter.com/en/developer-terms/more-on-restricted-use-cases