
ptt.ai, open source blockchain for AI Data Justice

[ Link to the Chinese-language coverage by Crossing (換日線) ]

The beginning of the "Data Justice" movement

By collaborating with online citizens and social science workers in Taiwan, Taiwan AILabs promotes "Data Justice" based on the following principles:

  1. Prioritize privacy and integrity, and review the goodwill of applications, before data is collected
    • In addition to enforcing privacy protection acts, review whether tech giants abuse their monopoly position to force users to give up their privacy, or misuse user content and data for other purposes. In particular, organizations that have become market monopolies should be reviewed regularly by the local administration to determine whether data is abused when users give up their privacy unwillingly.
  2. Users' data and activities belong to users
    • The platform should remain neutral to avoid the misuse of users' data and creations.
  3. Publicly collected data should be open for public research
    • The government organization holding the data is responsible for its openness, provided privacy and integrity are secured. Examples include health insurance data for public health research and smart city data for traffic research.
  4. Regulate mandatory data openness
    • For data critical to major public welfare but controlled by a private monopoly, the administration should be empowered to mandate data openness.
    • For example, Taipower's electricity usage data in Taiwan.

"Monopoly today is worse than the oil monopoly"

In 1882, the American oil magnate John D. Rockefeller founded the Standard Oil Trust and united about 40 oil-related companies to control prices. In 1890, the U.S. Sherman Antitrust Act was enacted to ensure fair trade and fair competition and to prevent price manipulation, and Standard Oil was eventually sued and broken up under it. Governments of other countries followed with anti-monopoly laws of their own. In 1984, the telecom giant AT&T was split into several companies under antitrust law, and in 2001 Microsoft was embroiled in an antitrust case over bundling Internet Explorer with its operating systems.

The Network Neutrality principle, articulated in 2003, holds that ISPs (Internet Service Providers) must treat all data on the Internet the same. The FCC (Federal Communications Commission) successfully stopped Comcast, AT&T, Verizon and other giants from slowing down or throttling traffic at the application or domain level. Apple's FaceTime, Google's YouTube and Netflix all benefited from the principle. A decade later, the oil and ISP companies are no longer among the ten most valuable companies in the world; instead, the Internet companies protected by Network Neutrality have become the new giants. In the US market, the world's most valuable companies dominate market share in many areas. In February 2018, Apple held 50% of the smartphone market, Google handled more than 60% of search traffic, and Facebook controlled nearly 70% of social traffic. Facebook and Google together controlled 73% of the online advertising market, and Amazon is on a path to take 50% of online shopping revenue. In China the situation is even worse: Alipay is owned by Alibaba and WeChat Pay by Tencent's WeChat, and together the two control 90% of China's payment market.

When data becomes a weapon, innovators and users become prey

After a series of AI breakthroughs in the 2010s, big data became as important as crude oil. In the Internet era, users grant Internet companies permission to collect their personal data in exchange for the convenience of connecting with credible users and content. For example, a magazine publishes articles on Facebook because Facebook lets users subscribe to them, and the publisher can manage its relationship with subscribers through the messenger system. The recommendation system helps rank users and the content they publish. All of these free services are sponsored by advertisements, which pay for hosting and traffic. This model encouraged more users to join the platform, and the users and content accumulated there attracted still more users to participate. In the 4G mobile era, users are always online, pushing data aggregation to a whole new level. After mergers and acquisitions among Internet companies, a few companies now stand out and dominate users' daily data. New initiatives can no longer reach users simply by launching a website or an app. Internet giants, on the other hand, can easily ship a copycat of an innovation and leverage their traffic, funding and data to take over the territory. Startups have little choice but to be acquired or burned out by unfair competition. There are fewer and fewer stories of innovation from garages, and more and more stories of tech giants copying startup ideas before they take shape. A widely quoted statement in China puts it this way: "Be acquired or die; no new startup can bypass the giants today." The monopoly also limits users' choices: if a user does not consent to the data collection policy, there is usually no alternative platform.

Net Neutrality repealed, giants eat the world

Nasim Aghdam's anger at YouTube cast a nightmarish shadow over how the platform deals with creators and advertisers. She opened fire at YouTube headquarters, injuring three people, and then killed herself. At the beginning of the Internet era, innovative content creators could be reasonably rewarded for their work. After the platforms became monopolies, however, content providers found their creations ranked by opaque algorithms that pushed them farther and farther away from their loyal subscribers. Before subscribers can reach the content, low-quality advertising and fake news stand in the way. If publishers want to retain their original reach, they must also pay for advertising; suddenly, reputable content providers are being charged to reach their own loyal subscribers. Even worse, their subscribers' information and behavior are consumed by the platform's machine-learning algorithms to serve targeted ads. At the same time, the platform does not effectively screen advertisers, so low-quality fake news and fake ads are served, and these have been exploited for scams and election manipulation. After the Facebook scandal, users discovered that their private data had been fed to analysis tools designed to influence their minds. Yet during the #deletefacebook movement, users found no alternative platform because of the giants' monopoly: their friends are all on the platform.

In December 2017, the FCC voted to repeal the Net Neutrality rules, reasoning that the US had failed to achieve Net Neutrality and that the ISPs were not the ones to blame. A decade on, the Internet companies that benefited from Net Neutrality are now the monopoly giants, and Net Neutrality was never applied to their private ranking and censorship algorithms. Facebook, for example, offers mobile access to selected sites on its platform at different data charges, a program widely panned for violating net neutrality principles; it is still active in 63 countries around the world. The situation is getting worse in the era of AI. Tech giants have leveraged their data power to step into the automotive, medical, home, manufacturing, retail, and financial sectors. Through acquisitions, the giants rapidly accumulate new types of vertical data and force traditional industries to give up data ownership. Within a decade, traditional industries will face an even larger and smarter technology monopoly than the ISPs or the oil companies.

Taiwan's experience may mitigate the global data monopoly

Starting from the root cause, from a vertical point of view, the user who contributes the data is motivated by trust in their friends or in a reputable content provider. In exchange for convenience and better service, the user consents to the collection of their private data and grants the platform the right to analyze it. The user who contributes the content consents to publishing their creations on the platform because the users are already there. The platform thus acquires power over data and content that should originally belong to the users and the publishers. In the name of privacy, safety and convenience, the platform then prevents other platforms or users from consuming that data. Repeated over time, this produces an exclusive platform for users and content providers alike.

From a horizontal point of view, in order to reach users and acquire data and traffic, startups sign unfair agreements with the platform. In the end, good innovations are usually swallowed by the platform, because the platform also owns the data and traffic the innovations depend on. The platform therefore grows larger and larger, either by acquiring or by copying good innovations.

In order to break this vicious cycle and create a fair competition environment for AI research, Taiwan AILabs presented at the Taipei Global Smart City Expo on March 27, 2018, and joined a panel at the Taiwan-German 2018 Global Solutions Workshop on March 28 with visiting experts and scholars on data policy making. Taiwan AILabs shared Taiwan's unique experience with Data Justice, and the discussion identified opportunities that could potentially break the cycle.

The opportunities come from the following observations in Taiwan. The world's mainstream online social network platforms are provided by private companies optimized for advertising revenue. Taiwan, by contrast, has a mature network of users, open-source workers and open-data campaigns; "Internet users" in Taiwan are closer to "online citizens". The Taiwanese Internet platform PTT (ptt.cc), for example, is not run for profit, and its users elect the board managers directly. Over the years this culture has not cooled down, and PTT still dominates. Because of its equality of voice, it is difficult to manipulate with advertising money, and fake news and fraud are easily exposed by online evidence. Compared with Facebook, PTT is more of a major platform for public opinion in Taiwan. Through a collaboration between PTT and Taiwan AILabs, it now has an AI news writer that reports news based on its users' activities; the AI-based news writer minimizes editorial bias.

g0v.tw is another non-profit organization in Taiwan, focused on civic technology. It promotes the transparency and openness of government organizations through hackathons, and it collaborates with government, academia, non-governmental organizations and international organizations on opening public data through open-source collaboration in various fields.

Introducing the ptt.ai project: using blockchain for "Data Justice" in the AI era

PTT is Taiwan's most influential online platform and has been running for 23 years. It has its own digital currency (the P coin), instant messaging, e-mail, users, elections, and administrators elected by users. However, the services hosting the platform are still relatively centralized.

In the past, users chose a trusted platform to obtain trusted information, and for convenience and hosting space, users and content providers consented to unfair data collection. To avoid centralized data storage, blockchain technology offers a new direction: a blockchain can certify users and content through its chain of trust. The credit system is not built on top of a single owner, and the content storage system is also built on top of the chain, which prevents any single organization from gaining control and becoming a superpower.
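As a rough illustration of the chain-of-trust idea (not the actual ptt.ai implementation, which is the open-source Go project linked at the end of this article), the sketch below hashes each piece of signed content into a chain, so that tampering with any earlier post invalidates every later link. A real system would additionally sign each block with the author's private key, which this toy example omits.

```python
# A minimal sketch of a content chain of trust, assuming a toy Post/Block model.
# Illustrative only; the real ptt.ai project (go-pttai) is written in Go.
import hashlib
import json
from dataclasses import dataclass, asdict

@dataclass
class Block:
    author: str          # the user who contributed the content
    content: str         # the post or comment itself
    prev_hash: str       # digest of the previous block, forming the chain

    def digest(self) -> str:
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

def append_block(chain: list[Block], author: str, content: str) -> Block:
    prev_hash = chain[-1].digest() if chain else "0" * 64
    block = Block(author=author, content=content, prev_hash=prev_hash)
    chain.append(block)
    return block

def verify_chain(chain: list[Block]) -> bool:
    # Each block must reference the digest of the block before it;
    # altering any earlier content breaks every later link.
    for prev, curr in zip(chain, chain[1:]):
        if curr.prev_hash != prev.digest():
            return False
    return True

chain: list[Block] = []
append_block(chain, "alice", "first post")
append_block(chain, "bob", "a reply")
assert verify_chain(chain)
chain[0].content = "tampered"     # modifying history...
assert not verify_chain(chain)    # ...is detected downstream
```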

Ptt.ai is a research project that starts from PTT's data economy, combines it with the latest blockchain and encryption technology, and implements it with a decentralized approach.

The mainstream social network platforms in China and the United States created a new data superpower out of the creations of users and their friends, and they will continue to collect more information by horizontally merging with industries that have less data power. The launch of ptt.ai is a different way of thinking about data ownership. We hope to study how to upgrade PTT for the era of AI, and to use this platform as a basis for enabling more industries to cooperate on data platforms. It gives control of data back to users and mitigates the data monopoly now forming. Ptt.ai will also collaborate with leading players in the automotive, medical, smart home, manufacturing, retail and financial sectors who are interested in creating open community platforms.

Currently, this technical experimentation has started on an independent platform; it does not yet involve the operation or migration of the current PTT. Please follow the latest news about ptt.ai on http://ailabs.tw.

 

[2018/10/24 Updates]:

The open-source project is now on GitHub: https://github.com/ailabstw/go-pttai

 


Humanity with Privacy and Integrity is the Taiwan AI Mindset

The 2018 Smart City Summit & Expo (SCSE), along with three sub-expos, took place at the Taipei Nangang Exhibition Center on March 27th, with 210 exhibitors from around the world exhibiting a diversity of innovative applications and solutions for building smart cities. Taiwan is known for its friendly and healthy business environment, ranked 11th by the World Bank. With more than 40 years of ICT manufacturing and top-tier embedded systems, its companies form a vigorous ecosystem. With this openness toward innovation, 17 of Taiwan's 22 cities have reached the top rankings of the Intelligent Community Forum (ICF).

Ethan Tu, founder of Taiwan AILabs, gave a talk on "AI in Smart Society for City Governance" and laid out Taiwan's AI position: smart cities are for "humanity with privacy and integrity", not only "safety and convenience". He said, "AI in Taiwan is for humanity. Privacy and integrity will also be protected." The maturity of crowd participation, transparency and the open-data mindset are the key assets that let Taiwan's smart cities deliver humanity with privacy and integrity. Taiwan AILabs cited its socially participatory, AI-collaborated, open-source news site http://news.ptt.cc as an example; city governments now consume this news feed to detect social events happening in Taiwan in real time, thanks to the AI news service's robustness and reliability at scale. AILabs collaborated with Tainan City on an AI drone project that emulates the style of "Beyond Beauty" director Chi Po-lin, who died in a helicopter crash. AILabs also established the "Taipei Traffic Density Network (TTDN)" for Taipei City, supporting real-time traffic detection and prediction while keeping citizens' privacy secured: no person or car can be identified without necessity.

The Global Solutions (GS) Taipei Workshop 2018, themed "Shaping the Future of an Inclusive Digital Society", took place at the Ambassador Hotel in Taipei on March 28, 2018. It was co-organized by the Chung-Hua Institution for Economic Research (CIER) and the Kiel Institute for the World Economy. The panel "Using Big Data to Support Economic and Societal Development" was hosted by Dennis Görlich, Head of the Global Challenges Center at the Kiel Institute for the World Economy, with Chien-Chih Liu, founder of the Asia IoT Alliance (AIOTA); Thomas Losse-Müller, Senior Fellow at the Hertie School of Governance; and Reuben Ng, Assistant Professor at the Lee Kuan Yew School of Public Policy, National University of Singapore, joining the discussion. Big data was identified as the oil for AI and economic growth. Ethan Tu shared his vision on the panel: "We don't have to sacrifice privacy for safety or convenience. The Facebook movement is a good example that tech giants who overlook privacy and integrity will be dumped."

Ethan explained three key principles from Taiwanese society on big data collection. These principles already exist, contributed by Taiwan's mature open-Internet communities and movements, and AILabs will promote them as fundamental guidance for data collection on medical records, government records, open communities and so on.

1. Data produced by users belongs to users. Policy makers shall ensure that no single authority, such as a social media platform, is so dominant that it can force users to give up data ownership.

2. Data collected by public agencies belongs to the public. Policy makers shall ensure that any public agency collecting data provides a roadmap for opening it to the general public for research. g0v.tw, for example, is an NPO behind the open-data movement.

3. "Net Neutrality" applies not only to ISPs but also to social media and content hosting services. Ptt.cc, for example, persists in equality of voice without ads; over time this equality of voice has overcome fake news, because the evidence stands out.

"Humanity is the direction for AILabs. Privacy and integrity are what we insist on," said Ethan.

Smart City workshop with the Amsterdam Innovation Exchange Lab from the Netherlands

SITEC from Malaysia visiting AILabs.tw

AI music composition passed Turing test

Music composition by computers has been of great research interest for a long time. Many techniques, such as rules, grammars, probabilistic graphical models, neural networks, and evolutionary methods, have been applied to automatic music generation. In this article we describe our approach and the corresponding results.

AI music recognition test

Before describing our method, let's test whether you can distinguish AI music from human music. Five AI tunes and five human tunes have been gathered and shuffled, and you are encouraged to pick the five you consider more likely to be machine-made. The true composers will be revealed later in this article.

Breaking music into components

To compose a tune with a computer, we break the tune into several components and generate each component individually (but dependently). A musical work, e.g., a classical piece or a modern pop song, usually consists of several voices played by several instruments. In some works we can easily recognize one voice as the main melody and the other voices as accompaniment. In this article, we focus on the generation of monophonic main melodies.

A monophonic melody is a sequence of notes, each consisting of a pitch and a duration. Collecting the pitches of all the notes gives what is called the voice leading, and collecting the durations yields the rhythm. There is usually another musical element underlying the main melody, the chord progression, which governs the primary transitions of mood. One can think of the chord progression as the supporting branches and the melody as the blooming flowers.

Techniques for musical components

Above we introduced three musical components: chord progression, rhythm, and voice leading. Our composition method generates chord progressions and voice leading with probabilistic graphical models, and rhythms with rules.

The procedure for generating a song is as follows. The time configuration, such as how long the song is and how many chords the chord progression contains, is decided by a human. The chord progression and the rhythm are then generated independently. Finally, the voice leading is generated to fit the chord progression and the rhythm, completing the composition. A rough sketch of this flow is given below.
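The exact models are not spelled out in this article. As a minimal sketch of the flavor of the approach, the following snippet generates a chord progression with a first-order Markov chain (one of the simplest probabilistic graphical models) and a rhythm with a fixed rule; the transition table and the rhythm rule are invented for illustration only.

```python
# Illustrative sketch only: the real system uses richer probabilistic
# graphical models, and the transition probabilities below are made up.
import random

# First-order Markov chain over chord symbols (toy transition table).
CHORD_TRANSITIONS = {
    "C":  {"F": 0.4, "G": 0.4, "Am": 0.2},
    "F":  {"C": 0.5, "G": 0.5},
    "G":  {"C": 0.7, "Am": 0.3},
    "Am": {"F": 0.6, "G": 0.4},
}

def generate_chord_progression(n_chords: int, start: str = "C") -> list[str]:
    progression = [start]
    for _ in range(n_chords - 1):
        nxt = CHORD_TRANSITIONS[progression[-1]]
        progression.append(random.choices(list(nxt), weights=list(nxt.values()))[0])
    return progression

def generate_rhythm(n_beats: int) -> list[float]:
    # Rule-based rhythm: alternate one quarter note and two eighth notes per beat.
    rhythm: list[float] = []
    for beat in range(n_beats):
        rhythm.extend([1.0] if beat % 2 == 0 else [0.5, 0.5])
    return rhythm

# The time configuration (length) is chosen by a human, as in the article.
chords = generate_chord_progression(n_chords=8)
rhythm = generate_rhythm(n_beats=16)
print(chords)
print(rhythm)
```

In the full procedure, the voice leading would then be sampled conditioned on the chord active at each note and on the rhythm, which is the final step described above.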

The answer to the AI music recognition test

Now we come back to the AI music recognition test. In the track list earlier in this article, A, D, H, I, and J were composed by computers with the procedure described above. The others are extracted from Johann Sebastian Bach's Well-Tempered Clavier, Book 1, as listed below.

B: Prelude No.2, bar 18.

C: Prelude No.10, bar 33.

E: Prelude No.2, bar 25.

F: Prelude No.5, bar 25.

G: Prelude No.2, bar 5.

Statistics of the AI music recognition test

Did you guess all the composers right? Let's see how other people performed. We held this test on Taiwan's PTT Bulletin Board System and had 85 participants. The resulting statistics are gathered below.

Correct guesses (out of 5)    0    1    2    3    4    5    total
Number of testees             6    9   37   24    6    3       85

Tune ID    Composer    Testees judging it right    % judging it right
A          AI          51                          60%
B          Bach        48                          56%
C          Bach        24                          28%
D          AI          43                          51%
E          Bach        41                          48%
F          Bach        39                          46%
G          Bach        42                          49%
H          AI          44                          52%
I          AI          19                          22%
J          AI          37                          44%

Average fraction of testees judging a tune correctly: 0.46

Most people made 2 or 3 correct guesses out of 5, which is about the accuracy of random selection; even the test holder mixes them up when not paying attention. So don't be too blue if you were fooled.

 

Featured photo by Brandon Giesbrecht / CC BY 2.0

Doppelgänger app – Can someone unlock your iPhone?

Could your doppelgänger trick your iPhone’s facial recognition feature into believing that you are the same person? The answer might lie within our newly-built facial recognition software "Doppelgänger app" at Ailabs.tw.

One of social media's hottest topics is "How can two celebrities, without any blood relation, look identical?" This discussion went viral on PTT, one of Taiwan's largest bulletin board systems (BBS), right after Apple released the "Face ID" feature with the iPhone X in November 2017. Many people wondered: could Elva Hsiao (蕭亞軒) unlock Landy Wen (溫嵐)'s iPhone?

 

Read More


Meet JARVIS – The Engine Behind AILabs

At Taiwan AI Labs, we are constantly teaching computers to see, hear, and feel the world so that they can make sense of it and interact with people in exciting new ways. The process moves a large amount of data through various training and evaluation stages, each of which consumes a substantial amount of compute. In other words, our computations are both CPU/GPU bound and I/O bound.

This imposes a tremendous engineering challenge, as conventional systems are either CPU bound or I/O bound, but rarely both.

We recognized this need and crafted our own computing environment from day one. We call it Jarvis internally, named after the system that runs everything for Iron Man. It primarily comprises a frontdoor endpoint that accepts media and control streams from the outside world, a cluster master that manages bare metal resources within the cluster, a set of streaming and routing endpoints that are capable of muxing and demuxing media streams for each computing stage, and a storage system to store and feed data to cluster members.

The core system is written in C++ with a Python adapter layer to integrate with various machine learning libraries.

 

 

The design of Jarvis emphasizes realtime processing capability. The core of Jarvis enables data streams to flow between computing processors with minimal latency, and each processing stage is engineered to achieve a required throughput per second. For a long, complex procedure, we break it down into smaller sub-tasks and use Jarvis to form a computing pipeline that achieves the target throughput. We also utilize muxing and demuxing techniques to process portions of the data stream in parallel, further increasing throughput without incurring too much latency. Once the computational tasks are defined, the blueprint is handed over to the cluster master, which allocates the underlying hardware resources and dispatches tasks to run on them. The allocation algorithm has to take special care with GPUs, as they are scarce resources that cannot be virtualized at the moment. A conceptual sketch of this staged-pipeline pattern appears below.
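Jarvis itself is an internal C++ system and its API is not public. Purely as a conceptual sketch of the staged-pipeline pattern described above, the toy example below connects stages with queues and runs two parallel workers on the heavy stage (demux) that feed a single output queue (mux).

```python
# Conceptual sketch only: not the Jarvis API. It shows the general pattern of
# staged workers connected by queues, with fan-out on the heavy stage.
import queue
import threading

def stage(name, in_q, out_q, fn):
    """Run a worker that pulls items from in_q, applies fn, and pushes to out_q."""
    def run():
        while True:
            out_q.put(fn(in_q.get()))
    threading.Thread(target=run, name=name, daemon=True).start()

decode_q, infer_q, result_q = queue.Queue(), queue.Queue(), queue.Queue()

stage("decode", decode_q, infer_q, lambda frame: frame.lower())        # light stage
for i in range(2):                                                     # heavy stage, parallelized
    stage(f"infer-{i}", infer_q, result_q, lambda frame: f"label({frame})")

frames = ["Frame-1", "Frame-2", "Frame-3", "Frame-4"]
for frame in frames:
    decode_q.put(frame)

for _ in frames:                 # collect exactly one result per input frame
    print(result_q.get())
```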

Altogether, Jarvis is a powerful yet agile platform for machine learning tasks. It handles a huge amount of work with minimal overhead, and it can be scaled horizontally with little effort simply by adding new machines to the cluster. It suits our needs well. We have re-engineered Jarvis several times in the past few months and will continue to evolve it; Jarvis is our engine for moving fast in this fast-changing AI field.

 

Featured image by Nathan Rupert / CC BY

Face Recognition – The essential part of “Face ID”

Upon seeing a person, the first thing that enters our eyes is the person's face. The human face plays an important role in our daily life when we interact and communicate with others. Unlike other biometrics such as fingerprints, identifying a person by their face can be a non-contact process: we can acquire face images from a distance and recognize the person without interacting with them directly. It is therefore intuitive to use the human face as the key to a Face Recognition system.

 

 

Over the last ten years, Face Recognition was a popular research area mainly within computer vision. With the rapid development of deep learning in recent years, however, Face Recognition has become a mainstream AI topic and more and more people are interested in the field. Many companies, such as Google, Microsoft and Amazon, have developed their own Face Recognition tools and applications. In late 2017, Apple introduced the iPhone X with Face ID, a Face Recognition system aimed at replacing the fingerprint-scanning Touch ID for unlocking the phone.

 

What can Face Recognition be used for?

  • automated border control for arrivals and departures at airports
  • access control systems for companies
  • criminal surveillance systems for government
  • transaction verification for consumers
  • unlocking phones or computers

 

How Does Face Recognition Work?

A Face Recognition system can be divided into three parts:

  • Face Detection: tell where the face is in the image
  • Face Representation: encode the facial features of a face image
  • Face Classification: determine which person it is

Face Detection

Face Detection locates the face in an image and determines its size. It is essentially an object-class detection problem for the class of human faces. In classical computer vision, a set of features is first extracted from the image, and classifiers or localizers are run in a sliding window over the whole image to find potential bounding boxes, which is time-consuming and complex. With deep learning, object detection can be accomplished by a single neural network, going from image pixels directly to bounding-box coordinates and class probabilities, with the benefits of end-to-end training and real-time prediction. YOLO, an open-source real-time object detection system, is used for Face Detection in our Face Recognition pipeline; a rough sketch of the detect-then-crop flow is shown below.
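Our pipeline uses YOLO for this stage. As an easy-to-run stand-in with the same role (image in, face bounding boxes out), the sketch below uses dlib's bundled HOG-based frontal face detector; the input file name is hypothetical.

```python
# A minimal detect-and-crop sketch. The article's pipeline uses YOLO for
# detection; dlib's HOG-based detector is used here only as a stand-in.
import dlib
from skimage import io   # any image loader returning a numpy array works

detector = dlib.get_frontal_face_detector()

image = io.imread("group_photo.jpg")      # hypothetical input file
faces = detector(image, 1)                # 1 = upsample once to find small faces

for i, rect in enumerate(faces):
    # Crop each detected face for the next stage (face representation).
    top, bottom = max(rect.top(), 0), rect.bottom()
    left, right = max(rect.left(), 0), rect.right()
    face_crop = image[top:bottom, left:right]
    print(f"face {i}: box=({left}, {top}, {right}, {bottom}) size={face_crop.shape}")
```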

 

Face Representation

To compare two faces, computing the distance between two face images pixel by pixel is impractical because of the computing time and resources required. Instead, we extract facial features to represent each face image.

"The distance between your eyes and ears", "the size of your nose and mouth", and so on.

These facial features provide a simple measurement for deciding whether two unknown faces belong to the same person. Eigenfaces and genetic algorithms were used in the old days to help discover such features. With deep learning, a deep neural network projects each face image onto a 128-dimensional unit hypersphere and generates a feature vector for each image.

For transforming face images into face representations, OpenFace and dlib are two commonly used models for generating feature vectors. We ran experiments with both and found that the dlib model's face representation is more consistent across frames for the same person, and it indeed outperformed the OpenFace model in our accuracy tests; as a result, dlib was chosen as our face representation model. A sketch of how two such embeddings can be compared is given below.
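As a sketch of this stage, the snippet below uses dlib's pretrained face recognition model to map two face images onto 128-dimensional embeddings and compares them by Euclidean distance. The image file names are hypothetical, and the model files are assumed to have been downloaded from dlib's model zoo.

```python
# Sketch of comparing two faces via dlib's 128-d embeddings. Assumes the
# pretrained model files below have been downloaded separately.
import dlib
import numpy as np
from skimage import io

detector = dlib.get_frontal_face_detector()
shape_predictor = dlib.shape_predictor("shape_predictor_5_face_landmarks.dat")
face_encoder = dlib.face_recognition_model_v1("dlib_face_recognition_resnet_model_v1.dat")

def embed(path: str) -> np.ndarray:
    """Return the 128-d embedding of the first face found in the image."""
    image = io.imread(path)
    rect = detector(image, 1)[0]             # assume at least one face
    shape = shape_predictor(image, rect)     # landmarks used for alignment
    return np.array(face_encoder.compute_face_descriptor(image, shape))

a = embed("person_a.jpg")                    # hypothetical file names
b = embed("person_b.jpg")
distance = np.linalg.norm(a - b)
# A commonly used rule of thumb for this model: distance < 0.6 suggests
# the two images show the same person.
print("same person?", distance < 0.6, f"(distance={distance:.3f})")
```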

 

Each vertical slice represents the face representation of a specific person from one image frame. The x-axis is the timestamp of each video frame. The results show that the dlib model makes more consistent image-to-representation transformations for the same person's face across frames.

 

Face Classification

By gathering the face representations of each person into a face database, a classifier can be trained to classify each person. To stabilize the final result, a "weighted moving average" is introduced into our system: classification results from previous frames are taken into consideration when determining the current result. We found that this mechanism smooths the final classification and yields better accuracy than classifying from a single image, as sketched below.
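A minimal sketch of the weighted-moving-average smoothing is shown below; the window size and weights are illustrative rather than the values used in production.

```python
# Sketch of weighted-moving-average smoothing over per-frame class scores.
import numpy as np

def smooth_predictions(frame_probs: list[np.ndarray], window: int = 5) -> list[int]:
    """frame_probs[t] holds the classifier's probabilities over identities at frame t."""
    weights = np.arange(1, window + 1, dtype=float)       # newer frames weigh more
    smoothed_labels = []
    for t in range(len(frame_probs)):
        history = frame_probs[max(0, t - window + 1): t + 1]
        w = weights[-len(history):]
        avg = np.average(np.stack(history), axis=0, weights=w)
        smoothed_labels.append(int(np.argmax(avg)))
    return smoothed_labels

# Toy example with two identities: a single noisy frame (t=2) no longer flips the label.
probs = [np.array([0.9, 0.1]), np.array([0.8, 0.2]),
         np.array([0.4, 0.6]), np.array([0.85, 0.15])]
print(smooth_predictions(probs))   # [0, 0, 0, 0]
```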

 

Featured image by Ars Electronica / CC BY


AI front desk – improving office security and working conditions

Imagine that someone in your office serves as doorkeeper, takes care of visitors, and even watches over your working conditions, 24/7. One of our missions at Ailabs.tw is to explore AI solutions that address society's problems and improve people's quality of life, and we have developed an AI-powered front desk that does all of the tasks mentioned above.

According to the 2016 annual report from Taiwan's MOL (Ministry of Labor), the average Taiwanese employee works 2,106 hours per year. Compared with OECD statistics, this ranks third in the world, just below Mexico and Costa Rica.

On December 4, 2017, the first reading of the Labor Standards Act revision was passed. The new version of the law will allow flexible work-time arrangements and expand maximum monthly work hours up to 300. Other major changes in the amendment include conditionally allowing employees to work 12 days in a row and reducing the minimum break between shifts from 11 hours to 8 hours. The ruling party plans to finish the second and third readings of this revision early next year (2018), which could put 9 million Taiwanese workers in a worse working environment. To shed the bad reputation of "Taiwan, the island of overwork", we need a system that notifies both employee and employer when someone has been severely overworked, with an attendance record that cannot easily be manipulated.

In May 2017, Luo Yufen, an employee of Pxmart, one of Taiwan's major supermarket chains, died of overwork after seven days in a coma. The OSHA (Occupational Safety and Health Administration) initially found no evidence of overwork after reviewing the clock-in records provided by Pxmart, which looked 'normal'. It was not until August, when further investigation of Luo's case was requested, that her real working hours before her death proved the overwork.

Read more

PTT Hired First AI Reporter Named Copycat (記者快抄)

Early this July, Ailabs.tw released an AI reporter named Copycat (記者快抄) that produces news covering content from Taiwan's largest online forum, PTT. It performs its job faster and produces more content in real time than its human colleagues.

 

 

Copycat can now write about 500 news articles on popular topics automatically every day.

The Requirements of Media Industry Nowadays

How to attract readers' attention and how to make content rank higher on social networks or search engines are becoming more and more important for the media industry. To meet this goal, reporters need to produce as many articles as they can, update fast enough, and search for interesting material all over the world. Copycat (記者快抄), an AI reporter, does this by generating news based on the most-discussed topics on Taiwan's largest online forum, PTT.

In the beginning this was a side project, but we found that people were interested in the website, so we put in more effort to improve it.

 

PTT, the biggest non-commercial forum in Taiwan.

 

Generate News Automatically

PTT is the largest terminal-based bulletin board system (BBS) in Taiwan, with more than 1.5 million registered users and over 150,000 users online at peak time. This non-commercial, open-source online platform has over 20,000 boards covering a multitude of topics and generates 500,000 comments every day.

Our system fetches important articles and posts from PTT every 30 minutes, parses them, and publishes the results on the dashboard. Likes and boos are also collected and displayed on each post, indicating the general public's reaction.

Three Steps to Generate News Articles

Summary

First, summarization: based on popular posts on the PTT forum, we describe the main idea in a few sentences. Article contents are broken into sentences, and each sentence is given a score representing how tightly it connects with the other sentences in the article. Other deep learning techniques, such as word embeddings, are also used to support the algorithm. A rough sketch of this kind of sentence scoring appears below.
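The exact scoring function is not described here. A common way to realize "how tightly a sentence connects with the others" is to embed each sentence and score it by its average similarity to the rest, as in the sketch below; the embedding function is a crude stand-in for the word-embedding techniques mentioned above.

```python
# Illustrative sentence scoring: score each sentence by its average cosine
# similarity to every other sentence, then keep the top-k as the summary.
import numpy as np

def embed_sentence(sentence: str) -> np.ndarray:
    # Stand-in embedding: character-frequency vector. A real system would use
    # word embeddings (e.g. averaged word vectors) instead.
    vec = np.zeros(256)
    for ch in sentence:
        vec[ord(ch) % 256] += 1.0
    return vec

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def summarize(sentences: list[str], k: int = 2) -> list[str]:
    vectors = [embed_sentence(s) for s in sentences]
    scores = []
    for i, vi in enumerate(vectors):
        sims = [cosine(vi, vj) for j, vj in enumerate(vectors) if j != i]
        scores.append(sum(sims) / max(len(sims), 1))
    top = sorted(np.argsort(scores)[-k:])          # keep original sentence order
    return [sentences[i] for i in top]

post = ["網友熱議新手機的價格。", "許多人認為價格太高。", "也有人分享購買心得。"]
print(summarize(post, k=2))
```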

 

AI-generated news from PTT

 

Fill-In

With a list of candidate sentences, we algorithmically pick and compile them into an article. We collected some widely used news templates, so Copycat can mix the key sentences with these templates and turn out an ordinary daily news article.

Generate

The last step is to make the news article more readable. PTT users often write posts in their own styles and formats, with unexpected new lines and spaces, which makes it hard for a machine to read and understand the content. To deal with this problem, we trained a model on newspaper text as a grammar corrector, teaching Copycat to write like a professional reporter.

Feature Image Selection

Text alone is not enough: a news article should have images. Posts on the PTT forum often include image links, which are a great resource; however, many posts have no associated image.

To search for an image the way a human editor does, we trained a multi-layer document-retrieval RNN model as an image search engine. The engine picks an image by comparing the text similarity between the image's description and the news content, roughly as sketched below.
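The retrieval model itself is not public. The sketch below only illustrates the matching step, reusing embed_sentence() and cosine() from the summarization sketch above to rank hypothetical candidate image descriptions against the article text.

```python
# Illustrative image selection: rank candidate images by the similarity of
# their text descriptions to the article. The real system uses an RNN retrieval model.
def pick_image(article_text: str, candidates: dict[str, str]) -> str:
    """candidates maps image URL -> text description; return the best-matching URL."""
    article_vec = embed_sentence(article_text)
    return max(candidates, key=lambda url: cosine(article_vec, embed_sentence(candidates[url])))

candidates = {
    "https://example.com/phone.jpg":   "新手機發表會現場照片",
    "https://example.com/weather.jpg": "颱風來襲的衛星雲圖",
}
print(pick_image("網友熱議新手機的價格與規格", candidates))
```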

Now, our AI reporter Copycat can not only copy the images from the original post, but also can find a related image when needed.

 

The figure was auto-selected by Copycat based on the text content

More to Come

The original categories on PTT and the topics extracted by Copycat are useful tags for people to find related news articles. The discussions and re-posts on the forum are potential data for showing further, different standpoints on certain topics.

After integrating our face and speech recognition modules, Copycat will be able to search video clips across the Internet for celebrities' comments on specific topics. This news knowledge graph can also benefit human reporters.

We believe artificial intelligence will be a support rather than a threat, helping reporters produce news of higher quality. By automating the process of picking topics and generating articles online, reporters can focus on creating insights and stories for readers.

Copycat is constantly improving and on the way to become a better reporter.

 

Featured image by filipe ferreira / CC BY

Recognize The Speech of Taiwan

We are exploring new ways for people to interact with technology in the age of AI, and speech is one of the most common and natural means of communication. In this post we introduce the core recipes of our automatic speech recognition system for Taiwan.

Cornerstone of Natural Human-Computer Interaction

Mobiles, IoT, wearable devices and robots: our daily lives will be more and more surrounded by smart devices in the future. To interact with them naturally, just as with human beings, we need to develop related AI techniques such as machine learning, computer vision, natural language processing and speech processing.

Speech recognition, also called ASR (automatic speech recognition), is one of the cornerstones that link all of these interactions together. With deep-learning-based models and a graph-based decoder, ASR is becoming more reliable in both accuracy and speed.

 

Unique Language Habits in Taiwan

New usages of words, new phrases and new sentence structures emerge every day in modern society and across cultures. This is especially true in Taiwan, where the language habits of Taiwanese people differ from those of other Mandarin speakers.

For these reasons, current ASR solutions for the Mandarin-speaking market have limitations in supporting the general, daily usage of Taiwanese people. For example, PTT, Taiwan's biggest forum and Internet community, coins hundreds of new words and phrases every month, and these newly created words may be used repeatedly and spread quickly by millions of users in online chats and posts.

Therefore, the challenge of building a localized ASR system is not only to train a local neural network model, but also to make the system update and adapt rapidly to a dynamically evolving language.

 

 

With a Taiwan-specific language model, our ASR is much more friendly to speech-related applications in Taiwan.

 

Multi-Language Speech Recognition

Although Mandarin is the official language of Taiwan, a Mandarin-only ASR system cannot satisfy our goals. Taiwan is a place with many cultures, and in addition to Mandarin, languages such as English, Taiwanese, Hakka and the Indigenous languages are also used quite often. To deal with this, Ailabs.tw gathered linguistics, phonetics and machine learning experts to set up a standard process for when ASR faces cross-language requirements.

 

 

These processes include enriching the language model with multiple languages and handling mixed-language words and sentences. Our early ASR experiments on Taiwanese work, and we are now bringing the system up to production level.

 

ASR Applications in Ailabs.tw

An ASR system is already powering the front-desk system at Ailabs.tw. When employees arrive at the office, they interact with the ASR system for door access and no longer need ID cards or badges.

An employee asks the ASR system for door access

Another application is generating automatic transcripts or captions. Videos of news, conferences and interviews can be converted to text in real time using ASR.

News videos can now have live captions generated with ASR

Our ASR API is ready to open; contact us if you are interested in further cooperation.

 

Looking Forward

Speed, accuracy, multi-language support and rapid updates are the core aspects of an easy-to-use ASR system. We are continuously improving these and trying different deep learning algorithms to reach the point where AI does a better job than humans in this field. If you are interested in working on this problem, please contact us; we are actively hiring!

 

Featured image by Peter Coombe / CC BY