Big Data Volume Estimation: How to Estimate Data Collection Costs for a Data Science Project


(Note: All opinions are my own)

Introduction

Data collection is the first and most fundamental step in any Data Science or Analytics project. All subsequent activities rely on it, from data analysis to model deployment.

With the pervasive presence of APIs and Cloud Computing, I am increasingly interested in maximizing the efficiency and level of automation of data collection activities for both work and personal projects.

In the latter category, I have been interested in collecting data from online home-rental platforms in the UK market (Zoopla, RightMove, OnTheMarket, and similar), with the aim of extracting image and text data for use in machine learning models. Possible use cases include predicting a property's price, extracting key features from image data to infer a listing's true value, and processing customer reviews with NLP techniques.

In the following lines, I aim to discuss how one might go about:

  1. The identification of the most critical data sources

  2. The estimation of data collection costs, should you want to put your solution to commercial use

I have given the article a broader scope: it touches upon the market and regulatory considerations that arise when reasoning about data collection for potentially commercial purposes, as well as the more technical considerations of working with APIs, as I realize there are multiple layers to uncover within this very interesting topic.

I hope the key points below will prove useful in setting up the Data Collection block of your current and future Data Science projects, whatever your industry focus.
