Scala Learning on Coursera Week3

作者: ShuiLocked | 来源:发表于2016-07-12 15:56 被阅读769次

    Week3: TweetSet Class

    最近一直在跟Coursera上的scala和cloud computing课,干脆开博客记录一下遇到的问题和解决方法。Scala作为函数式编程,如何熟练掌握function的传递,tail recursion的设计是这门课前半部分的重点。这次的作业是设计一个TweetSet,Tweet是一个记录了user,string,Retweeted三个field的class,要求TweetSet以binary tree形式存储,同时满足特定函数。

    Aim: To implement a TweetSet Class, including its fundamental functions.Eg: union, filter(p: Tweet => Boolean), mostRetweeted, descendingByRetweet

    Note:
    filter and mostRetweeted are both implemented using tail recursion(filterAcc and mostAcc)
    注意tail recursion的逻辑,一般都需要传递一个中间结果(1作为累加值或者2作为比较值),从而使下次递归可以直接比较之前完成部分的结果。注意递归顺序,这里因为是binary tree,一般都是先elem,再left,最后right。
    filterAcc records a TweetSet acc that represents the accumulative tweetSet which satisfy the predicate pif (p(elem)) then include current tweet into the tweetSet
    mostAcc records a tweet object with maximum retweets. At beginning, it is initialized as new tweet("ini","ini",0)retweet number is 0 at first!

    Difficulties:
    Remember how to use function to implement traverse of the list!!! Just as follows.
    google.exists(x => y.text.contains(x))

    val google = List("android", "Android", "galaxy", "Galaxy", "nexus", "Nexus")
    val apple = List("ios", "iOS", "iphone", "iPhone", "ipad", "iPad")lazy 
    val googleTweets: TweetSet = TweetReader.allTweets.filter(y => google.exists(x => y.text.contains(x)))lazy 
    val appleTweets: TweetSet = TweetReader.allTweets.filter(y => apple.exists(x => y.text.contains(x)))
    

    ----------Codes-------------

    package objsets
    
    import TweetReader._
    
    /**
     * A class to represent tweets.
     */
    class Tweet(val user: String, val text: String, val retweets: Int) {
      override def toString: String =
        "User: " + user + "\n" +
        "Text: " + text + " [" + retweets + "]"
    }
    
    /**
     * This represents a set of objects of type `Tweet` in the form of a binary search
     * tree. Every branch in the tree has two children (two `TweetSet`s). There is an
     * invariant which always holds: for every branch `b`, all elements in the left
     * subtree are smaller than the tweet at `b`. The elements in the right subtree are
     * larger.
     *
     * Note that the above structure requires us to be able to compare two tweets (we
     * need to be able to say which of two tweets is larger, or if they are equal). In
     * this implementation, the equality / order of tweets is based on the tweet's text
     * (see `def incl`). Hence, a `TweetSet` could not contain two tweets with the same
     * text from different users.
     *
     *
     * The advantage of representing sets as binary search trees is that the elements
     * of the set can be found quickly. If you want to learn more you can take a look
     * at the Wikipedia page [1], but this is not necessary in order to solve this
     * assignment.
     *
     * [1] http://en.wikipedia.org/wiki/Binary_search_tree
     */
    abstract class TweetSet {
    
      /**
       * This method takes a predicate and returns a subset of all the elements
       * in the original set for which the predicate is true.
       *
       * Question: Can we implment this method here, or should it remain abstract
       * and be implemented in the subclasses?
       */
        def filter(p: Tweet => Boolean): TweetSet =
          this.filterAcc(p, new Empty())
      
      /**
       * This is a helper method for `filter` that propagetes the accumulated tweets.
       */
      def filterAcc(p: Tweet => Boolean, acc: TweetSet): TweetSet
    
      /**
       * Returns a new `TweetSet` that is the union of `TweetSet`s `this` and `that`.
       *
       * Question: Should we implment this method here, or should it remain abstract
       * and be implemented in the subclasses?
       */
        def union(that: TweetSet): TweetSet 
      
      /**
       * Returns the tweet from this set which has the greatest retweet count.
       *
       * Calling `mostRetweeted` on an empty set should throw an exception of
       * type `java.util.NoSuchElementException`.
       *
       * Question: Should we implment this method here, or should it remain abstract
       * and be implemented in the subclasses?
       */
        def mostRetweeted: Tweet = mostAcc(new Tweet("ini_max", "ini_max", 0))
        def mostAcc(max: Tweet): Tweet
      
      /**
       * Returns a list containing all tweets of this set, sorted by retweet count
       * in descending order. In other words, the head of the resulting list should
       * have the highest retweet count.
       *
       * Hint: the method `remove` on TweetSet will be very useful.
       * Question: Should we implment this method here, or should it remain abstract
       * and be implemented in the subclasses?
       */
        def descendingByRetweet: TweetList 
      
      /**
       * The following methods are already implemented
       */
    
      /**
       * Returns a new `TweetSet` which contains all elements of this set, and the
       * the new element `tweet` in case it does not already exist in this set.
       *
       * If `this.contains(tweet)`, the current set is returned.
       */
      def incl(tweet: Tweet): TweetSet
    
      /**
       * Returns a new `TweetSet` which excludes `tweet`.
       */
      def remove(tweet: Tweet): TweetSet
    
      /**
       * Tests if `tweet` exists in this `TweetSet`.
       */
      def contains(tweet: Tweet): Boolean
    
      /**
       * This method takes a function and applies it to every element in the set.
       */
      def foreach(f: Tweet => Unit): Unit
    }
    
    class Empty extends TweetSet {
        def filterAcc(p: Tweet => Boolean, acc: TweetSet): TweetSet = acc
        
        def union(that: TweetSet): TweetSet = that
        
        def mostAcc(max: Tweet): Tweet = max
        
         def descendingByRetweet: TweetList = Nil
      
      /**
       * The following methods are already implemented
       */
    
      def contains(tweet: Tweet): Boolean = false
    
      def incl(tweet: Tweet): TweetSet = new NonEmpty(tweet, new Empty, new Empty)
    
      def remove(tweet: Tweet): TweetSet = this
    
      def foreach(f: Tweet => Unit): Unit = ()
    }
    
    class NonEmpty(elem: Tweet, left: TweetSet, right: TweetSet) extends TweetSet {
    
        def filterAcc(p: Tweet => Boolean, acc: TweetSet): TweetSet = 
          if  (p(elem)) right.filterAcc(p, left.filterAcc(p, acc.incl(elem)))
          else  right.filterAcc(p, left.filterAcc(p, acc))
      
        def union(that: TweetSet): TweetSet = 
          ((left union right) union that) incl elem
          
           def mostAcc(max: Tweet): Tweet = 
             if (elem.retweets > max.retweets) right.mostAcc(left.mostAcc(elem)) 
             else right.mostAcc(left.mostAcc(max)) 
        
        def descendingByRetweet: TweetList = 
          new Cons(this.mostRetweeted, this.remove(this.mostRetweeted).descendingByRetweet)
          
             
      /**
       * The following methods are already implemented
       */
    
      def contains(x: Tweet): Boolean =
        if (x.text < elem.text) left.contains(x)
        else if (elem.text < x.text) right.contains(x)
        else true
    
      def incl(x: Tweet): TweetSet = {
        if (x.text < elem.text) new NonEmpty(elem, left.incl(x), right)
        else if (elem.text < x.text) new NonEmpty(elem, left, right.incl(x))
        else this
      }
    
      def remove(tw: Tweet): TweetSet =
        if (tw.text < elem.text) new NonEmpty(elem, left.remove(tw), right)
        else if (elem.text < tw.text) new NonEmpty(elem, left, right.remove(tw))
        else left.union(right)
    
      def foreach(f: Tweet => Unit): Unit = {
        f(elem)
        left.foreach(f)
        right.foreach(f)
      }
    }
    
    trait TweetList {
      def head: Tweet
      def tail: TweetList
      def isEmpty: Boolean
      def foreach(f: Tweet => Unit): Unit =
        if (!isEmpty) {
          f(head)
          tail.foreach(f)
        }
    }
    
    object Nil extends TweetList {
      def head = throw new java.util.NoSuchElementException("head of EmptyList")
      def tail = throw new java.util.NoSuchElementException("tail of EmptyList")
      def isEmpty = true
    }
    
    class Cons(val head: Tweet, val tail: TweetList) extends TweetList {
      def isEmpty = false
    }
    
    
    object GoogleVsApple {
      val google = List("android", "Android", "galaxy", "Galaxy", "nexus", "Nexus")
      val apple = List("ios", "iOS", "iphone", "iPhone", "ipad", "iPad")
    
        lazy val googleTweets: TweetSet = TweetReader.allTweets.filter(y => google.exists(x => y.text.contains(x)))
      lazy val appleTweets: TweetSet = TweetReader.allTweets.filter(y => apple.exists(x => y.text.contains(x)))
      
      /**
       * A list of all tweets mentioning a keyword from either apple or google,
       * sorted by the number of retweets.
       */
         lazy val trending: TweetList = (googleTweets.union(appleTweets)).descendingByRetweet
      }
    
    object Main extends App {
      // Print the trending tweets
      GoogleVsApple.trending foreach println
    }
    

    相关文章

      网友评论

        本文标题:Scala Learning on Coursera Week3

        本文链接:https://www.haomeiwen.com/subject/urqsjttx.html