ZHIYUAN (ZOE) LIN

BIDMat - HDFSIO

11/19/2015

0 Comments

 
To fully understand Hadoop/Spark IO, you should first understand "Sequence File" and "Serializable".
I was confused by both for a while, so I decided to write this post.
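As a rough illustration of the "Serializable" half, here is a minimal sketch of a plain Java serialization round trip. The class `SerializableRoundTrip`, the record type `Point`, and the helpers `toBytes` and `fromBytes` are hypothetical names made up for this example and are not taken from Hadoop, Spark, or BIDMat. It only uses `java.io`, so it does not cover Hadoop's SequenceFile, which needs the Hadoop libraries and typically works with `Writable` types rather than `java.io.Serializable`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializableRoundTrip {

    // Hypothetical record type, used only for illustration.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;

        final int x;
        final int y;
        // transient fields are skipped by default Java serialization.
        transient String label;

        Point(int x, int y, String label) {
            this.x = x;
            this.y = y;
            this.label = label;
        }
    }

    // Serialize any Serializable object into an in-memory byte array.
    static byte[] toBytes(Serializable obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    // Rebuild an object from bytes produced by toBytes.
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Point original = new Point(3, 4, "not serialized");
        Point copy = (Point) fromBytes(toBytes(original));
        // The int fields survive the round trip; the transient label does not.
        System.out.println(copy.x + "," + copy.y + " label=" + copy.label);
        // prints "3,4 label=null"
    }
}
```

The same mechanism is what breaks when a class that does not implement `Serializable` is shipped across JVMs: `writeObject` throws `java.io.NotSerializableException`.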
Java OOP Review: Inheritance, Hierarchy for package java.lang

Read More

Debugging: Using (scala) CUDA in Spark(2) - (ubuntu14.04)

11/10/2015

0 Comments

 
It was a pain to get JCuda and Scala working on Spark. In case I (or someone else) need to install them again later, I will try my best to recall the most harmful errors. (Many small bugs are easy to solve just by googling them, so I won't mention those.)

Read More

Using (scala) CUDA in Spark(1) -- based on JCuda

11/9/2015

1 Comment

 
Many people want to leverage CUDA for Scala (machine learning) code. Sadly, CUDA doesn't support Scala.
But there is always hope! We can try the following approaches:

Read More

Is CPU the new bottleneck? - Tungsten (Spark DataFrame)

11/3/2015

0 Comments

 
Performance optimization is a never-ending process. Project Tungsten will be the largest change to Spark's execution engine since the project's inception. It aims to substantially improve the memory and CPU efficiency of Spark applications, pushing performance closer to the limits of modern hardware.
Why the CPU, rather than IO, is the main bottleneck: 1. Hardware has improved. 2. Spark's IO has been optimized. 3. Data formats have improved. 4. Serialization and hashing are CPU-bound bottlenecks.
Three initiatives:

Read More

Matrix Operations in Spark MLlib

11/1/2015

0 Comments

 
MLlib supports local vectors and matrices stored on a single machine, as well as distributed matrices backed by one or more RDDs. Local vectors and local matrices are simple data models that serve as public interfaces. The underlying linear algebra operations are provided by Breeze and jblas. 
Related Data Types:

Read More


© 2015 Zhiyuan Lin. All rights reserved.