Open Source Software

  1. H2O.ai, which builds applications that make machine learning accessible to business users, has introduced a product that lets people familiar with tools like Tableau extract insights from data without expertise in deploying or tuning machine learning models.

    Driverless AI, currently in beta, is billed by H2O.ai as an “expert system for AI” — a way to automate the kinds of expertise that data scientists bring to developing machine learning models. The target audience is non-expert users, who can take datasets and run GPU-accelerated ML algorithms against them to extract useful results, without understanding the ins and outs of data science.

  2. The Open Container Initiative, a consortium founded to develop open standards around Docker-style containers across platforms, has delivered 1.0 milestones for two crucial specifications under its banner.

    The new standards aren't likely to affect the way developers work with containers. The real impact will probably be felt by commercial producers of container-related products, especially those angling to have OCI certification applied to what they produce.

    OCI's newly finalized standards cover two key components of the container ecosystem: the image format for containers and the runtime specification. The OCI Image Format, as the first is formally called, is easy enough to grasp. It describes the way a container image is laid out internally and what its various components are.

  3. Russian search engine creator Yandex has joined the ranks of Google, Amazon, and Microsoft by releasing its own open source machine learning library, CatBoost.

    The Apache-licensed CatBoost is for “open-source gradient boosting on decision trees,” according to its GitHub repository’s README. It provides a way to perform classifications and rankings of data by using a collection of decision-making mechanisms, or “learners,” rather than a single one. Results generated by the learners are weighted and classified based on the strengths and weaknesses of each learner. By combining many learners, CatBoost can yield better results than decision-making systems that rely on individual learners.

  4. The official Go blog has provided the first concrete details about the next version of Google’s Go language, which is used to create popular applications like Docker and Kubernetes, as well as to incrementally replace critical internet infrastructure.

    But Go developers hoping for immediate word about generics, or other pet features they’ve long wanted added to the language, are going to walk away disappointed.

    The post, written by Go architect Russ Cox, details how the chief goal for Go 2 is “to fix the most significant ways Go fails to scale.” By “scale,” Cox is referring to both production and development. The former is about “concurrent systems interacting with many other servers, exemplified today by cloud software,” and the latter is about “large codebases worked on by many engineers coordinating only loosely, exemplified today by modern open-source development.”

  5. With version 2.2 of Apache Spark, a long-awaited feature for the multipurpose in-memory data processing framework is now available for production use.

    Structured Streaming, as that feature is called, allows Spark to process streams of data in ways that are native to Spark's batch-based data-handling metaphors. It's part of Spark's long-term push to become, if not all things to all people in data science, then at least the best thing for most of them.

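As a rough illustration of the boosting idea described in the CatBoost item above (item 3), the sketch below combines many weak learners, each a one-split decision "stump," into a single predictor. It is a toy written in plain Python for a single numeric feature and squared-error loss. It is not CatBoost's algorithm and does not use the CatBoost API; the function names (`fit_stump`, `fit_boosted`, `predict`, `mse`) and the default settings are hypothetical choices made for this example.

```python
# Toy gradient boosting with one-split decision stumps (squared-error loss).
# Illustrative only: NOT CatBoost's algorithm or API. All names are hypothetical.

def fit_stump(xs, residuals):
    """Pick the threshold that best splits the residuals into two constant groups."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue  # a split must leave points on both sides
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        # No valid split (all x values identical): fall back to a constant.
        mean = sum(residuals) / len(residuals)
        return (xs[0], mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_boosted(xs, ys, n_rounds=50, learning_rate=0.3):
    """Each round fits a stump to the errors left by the previous rounds."""
    base = sum(ys) / len(ys)          # start from the mean prediction
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(model, xs, ys):
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


if __name__ == "__main__":
    xs = list(range(10))
    ys = [x * x for x in xs]
    for rounds in (1, 5, 50):
        model = fit_boosted(xs, ys, n_rounds=rounds)
        print(rounds, "rounds -> training MSE", round(mse(model, xs, ys), 3))
```

Running the script prints the training error for 1, 5, and 50 rounds on a small made-up dataset. The error shrinks as more stumps are added, which is the sense in which an ensemble of weak learners can outperform any single one of them.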
