Data Lake

Thinking about deep machine learning, how ready is your data (cleaned, prepped and labeled) for compute-heavy data models?

Top Answer : It probably depends on the age and size of the business. Ours has divisions that are over 100 years old and others that were acquired less than 2 years ago. We struggle with data quality even within a single division. Note that our smallest division is probably 25,000 employees and our largest 40,000; the total company is 130k people, with 15 major ERPs and probably 20 minor ERPs. We still sometimes debate customer address and shipping fields during consolidations. My estimate: 35% of the data we would like to use is ready.

How do you manage data access/policy management across cloud and 3rd party applications? Include your favorite vendor in the comments.

Top Answer : IAM is a start, but you also need multi-cloud SIEM tools that integrate with the cloud providers and IAM. I look at combinations of Okta, SailPoint, Sumo Logic, Splunk, and other tools as we build.

Does anyone have a favorite alternative to Tableau? Why?

Top Answer : Power BI looks more modern, is included in our O365 licensing, and appears to have all the functionality we use and need in Tableau, without the public/server licensing limitations, overhead, hassle, and needless complexity. It feels like part of our ecosystem.

I'm building most on... share "why" in comments...

Top Answer : Virtually nose-to-nose here out the gate, would be great to hear some arguments for/against your choice...  thanks for participating in the poll - I recently joined Pulse so we can put that interview we were planning in my past life on permanent hold ;)  - any additional thoughts here? Thanks!

What are your thoughts on SaaS management platforms (SMP)?

Top Answer :

Who’s responsible for loading raw data into your Data Lake?

Top Answer : Data owners are. While the Data Lake or Enterprise Platform Team provides the tools, processes, and constraints for the upload, no one understands the data better than the data owners. Based on that understanding, data owners select the most relevant and applicable data, make sure it is clean and meets the quality requirements, and prepare the final data set to be uploaded. Then they load the data using the tools and processes provided by the platform teams.
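As a rough illustration of the hand-off described in this answer, the sketch below shows a data owner running a quality check before calling a loader supplied by the platform team. Everything in it is hypothetical: the `REQUIRED_FIELDS` rule, the `validate_rows` and `load_to_lake` names, and the in-memory list standing in for the lake are assumptions made for illustration, not any real platform's API.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]

# Hypothetical quality rule set by the platform team: fields that must be non-empty.
REQUIRED_FIELDS = ("customer_id", "ship_to_address")


def validate_rows(rows: List[Row]) -> Tuple[List[Row], List[Row]]:
    """Split rows into (clean, rejected) based on the required-field rule."""
    clean: List[Row] = []
    rejected: List[Row] = []
    for row in rows:
        if all(row.get(field, "").strip() for field in REQUIRED_FIELDS):
            clean.append(row)
        else:
            rejected.append(row)
    return clean, rejected


def load_to_lake(rows: List[Row], lake: List[Row]) -> int:
    """Stand-in for the platform team's upload tool; appends to an in-memory 'lake'."""
    lake.extend(rows)
    return len(rows)


if __name__ == "__main__":
    raw = [
        {"customer_id": "C1", "ship_to_address": "1 Main St"},
        {"customer_id": "C2", "ship_to_address": ""},
    ]
    lake: List[Row] = []
    clean, rejected = validate_rows(raw)
    loaded = load_to_lake(clean, lake)
    print(f"loaded={loaded} rejected={len(rejected)}")
```

The split of responsibilities mirrors the answer: the platform side owns the rule and the loader, while the data owner decides which rows to submit and deals with the rejected ones.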

Are data lakes a good way to manage big data?

Top Answer : Yes, but only if they are created and managed properly; otherwise the data lake becomes a data warehouse.

How should we approach the problem of excessive data collection?

Top Answer : I started on the idea of multi-tenancy for IoT, or realistically multi-tenancy for data, back in 2016. The basic idea is that we need to find the right way to get the maximum value out of the infrastructure we're building, and thereby avoid creating even more sets of data about the same things. I have no idea if this is even possible, but I've used a similar model for infrastructure design and build in the past. What if you could work with manufacturers, from an application standpoint, to define data value prioritization and retention models for specific operational environments, like the shop floor or manufacturing machines, so that a predefined policy could simply be applied? While it sounds great, the reason I think it would never work is that there's never been a time where somebody has said, "Well, can you be 100% certain that I'll never want to go back and look at that data?"
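To make the idea of per-environment retention models a little more concrete, here is a minimal sketch of what such a policy table might look like. The environment names, priorities, and retention windows are invented for illustration, and `RetentionPolicy`, `POLICIES`, and `should_retain` are hypothetical names, not part of any vendor's product.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict


@dataclass(frozen=True)
class RetentionPolicy:
    priority: int      # hypothetical value ranking, 1 = most valuable
    retain_days: int   # how long to keep records from this environment


# Hypothetical per-environment policies; the environments and numbers are made up.
POLICIES: Dict[str, RetentionPolicy] = {
    "shop_floor": RetentionPolicy(priority=2, retain_days=90),
    "manufacturing_machine": RetentionPolicy(priority=1, retain_days=365),
}


def should_retain(environment: str, recorded_on: date, today: date) -> bool:
    """Keep a record if it is still inside its environment's retention window.

    Records from environments without a policy are kept, since nothing
    has been decided about them yet.
    """
    policy = POLICIES.get(environment)
    if policy is None:
        return True
    return today - recorded_on <= timedelta(days=policy.retain_days)
```

Defaulting to "keep" for unknown environments reflects the objection raised in the answer: nobody can be certain the data will never be wanted again, so the sketch only discards what a policy explicitly covers.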