HPCC-19611

Spark Connector -- Compressed files and error indications

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 7.0.0
    • Component/s: Embedded Languages
    • Labels:
      None
    • Environment:
      Latest platform build

      Description

      The Spark connector fails with an out-of-memory error when running the following program (the file name used appears in the program itself):

      import org.hpccsystems.spark.HpccFile
      import org.hpccsystems.spark.thor.RemapInfo
      import org.apache.spark.sql._ // Dataset
      import org.apache.spark.ml.feature.VectorAssembler
      import org.apache.spark.ml.classification.LogisticRegression
      import org.apache.spark.mllib.evaluation.MulticlassMetrics

      val ri = new RemapInfo(20, "10.240.37.108")
      // Parameters: <thorFileName>, <protocol>, <service address>, <service port>, <userid>, <password>, <remapInfo (optional)>
      // Dashboard file
      val hpcc = new HpccFile("~sg::test::thor::dashboard::distributed::logfiles", "http", "10.240.37.76", "8010", "", "", ri)
      val my_df = hpcc.getDataframe(spark)
      my_df.printSchema()
      my_df.show()

              People

              • Assignee:
                johnholt John Holt
              • Reporter:
                rdev Roger Dev
              • Votes:
                0
              • Watchers:
                2
