
AGC Modification

Discussion in 'Wordpress' started by stockptc, Mar 4, 2011.

Thread Status:
Not open for further replies.
  1. stockptc

    stockptc Ads.id Fan

    Joined:
    Jan 21, 2011
    Messages:
    216
    Likes Received:
    112
    Location:
    aceh
    Its purpose is to fetch the full content of one of the URLs returned by a Bing search.

    Code:
    <?php
    
    // returns XHTML
    function grabArticleHtml($html) {
        $contentNode = grabArticle($html);
        return $contentNode->ownerDocument->saveXML($contentNode);
    }
    
    // returns DOMElement object
    function grabArticle($html) {
        // Replace all doubled-up <br> tags (in any letter case) with <p> tags, and remove fonts.
        $html = preg_replace('!<br ?/?>\s*<br ?/?>!i', '</p><p>', $html);
        $html = preg_replace('!</?font[^>]*>!', '', $html);
        $document = new DOMDocument();
        $html = mb_convert_encoding($html, 'HTML-ENTITIES', "UTF-8");
        @$document->loadHTML($html);
        
        $allParagraphs = $document->getElementsByTagName('p');
    
    
        $topDiv = null;
    
        $articleContent = $document->createElement('div');
        
    
        // Study all the paragraphs and find the chunk that has the best score.
        // A score is determined by things like: Number of <p>'s, commas, special classes, etc.
        for ($j=0; $j < $allParagraphs->length; $j++) {
            $parentNode = $allParagraphs->item($j)->parentNode;
    
            // Initialize readability data
            if (!$parentNode->hasAttribute('readability'))
            {
                $readability = $document->createAttribute('readability');
                $readability->value = 0;
                $parentNode->setAttributeNode($readability);
    
                // Look for a special classname
                if($parentNode->hasAttribute('class') && $parentNode->getAttribute('class') != '')
                {
                    if (preg_match('/combx|comment|disqus|extra|foot|header|menu|remark|rss|shoutbox|sidebar|sponsor|ad-break|agegate|pagination|pager|popup/', $parentNode->getAttribute('class'))) {
                        $readability->value -= 50;
                    } else if (preg_match('/article|body|content|entry|hentry|main|page|pagination|post|text|blog|story/', $parentNode->getAttribute('class'))) {
                        $readability->value += 25;
                    }
                }
                
                
                // Look for a special ID
                if($parentNode->hasAttribute('id') && $parentNode->getAttribute('id') != '')
                {
                
                    if (preg_match('/(combx|comment|disqus|extra|foot|header|menu|remark|rss|shoutbox|sidebar|sponsor|ad-break|agegate|pagination|pager|popup)/', $parentNode->getAttribute('id'))) {
                        $readability->value -= 50;
                    } else if (preg_match('/article|body|content|entry|hentry|main|page|pagination|post|text|blog|story/', $parentNode->getAttribute('id'))) {
                        $readability->value += 25;
                    }
                }
    
            } else {
                $readability = $parentNode->getAttributeNode('readability');
            }
    
            // Add a point for the paragraph found
            if(strlen($allParagraphs->item($j)->textContent) > 10) {
                $readability->value++;
            }
    
            // Add points for any commas within this paragraph
            $readability->value += substr_count($allParagraphs->item($j)->textContent, ',');
        }
    
        // Pick the element with the highest readability score.
        $allElements = $document->getElementsByTagName('*');
        $topDiv = null;
        foreach ($allElements as $node) {
            if($node->hasAttribute('readability') && ($topDiv == null || (int)$node->getAttribute('readability') > (int)$topDiv->getAttribute('readability'))) {
                $topDiv = $node;
            }
        }
    
        if($topDiv == null) {
            $topDiv = $document->createElement('div', 'Sorry, readability was unable to parse this page for content.');
        } else {
            cleanStyles($topDiv);                    // Removes all style attributes
            
            $topDiv = killDivs($topDiv);                // Removes DIVs that contain more non-<p> content than <p> content
            $topDiv = killBreaks($topDiv);            // Collapses consecutive <br />'s into a single <br />
    
            // Cleans out junk from the topDiv just in case:
            $topDiv = clean($topDiv, 'form');
            $topDiv = clean($topDiv, 'object');
            $topDiv = clean($topDiv, 'table', 250);
    
            $topDiv = clean($topDiv, 'a');
            $topDiv = clean($topDiv, 'ul');
            $topDiv = clean($topDiv, 'li');
            $topDiv = clean($topDiv, 'h1');
            $topDiv = clean($topDiv, 'h2');
            $topDiv = clean($topDiv, 'h3');
            $topDiv = clean($topDiv, 'h4');
            $topDiv = clean($topDiv, 'iframe');
            $topDiv = clean($topDiv, 'script');
    
        }
        
        $articleContent->appendChild($topDiv);
        
        return $articleContent;
    }
    
    function cleanStyles($node) {
        $elems = $node->getElementsByTagName('*');
        foreach ($elems as $elem) {
            $elem->removeAttribute('style');
        }
    }
    
    function killDivs ($node) {
        $divsList = $node->getElementsByTagName('div');
        $curDivLength = $divsList->length;
        
        // Gather counts for other typical elements embedded within.
        // Traverse backwards so we can remove nodes at the same time without affecting the traversal.
        for ($i=$curDivLength-1; $i >= 0; $i--) {
            $p = $divsList->item($i)->getElementsByTagName('p')->length;
            $img = $divsList->item($i)->getElementsByTagName('img')->length;
            $li = $divsList->item($i)->getElementsByTagName('li')->length;
            $a = $divsList->item($i)->getElementsByTagName('a')->length;
            $embed = $divsList->item($i)->getElementsByTagName('embed')->length;
    
            // If the number of commas is less than 10 (bad sign) ...
            if (substr_count($divsList->item($i)->textContent, ',') < 10) {
                // ... and there are more non-paragraph elements than paragraphs,
                // or other ominous signs, remove the DIV unless it contains an image.
                if (($li > $p || $a > $p || $p == 0 || $embed > 0) && $img == 0) {
                    $divsList->item($i)->parentNode->removeChild($divsList->item($i));
                }
            }
        }
        return $node;
    }
    
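    function killBreaks ($node) {
        $pattern = '!(<br\s*/?>(\s|&nbsp;)*)+!';
        $xml = $node->ownerDocument->saveXML($node);
        $xml = preg_replace($pattern, '<br />', $xml);
        $f = $node->ownerDocument->createDocumentFragment();
        // @ to prevent PHP warnings; keep the original node if the markup cannot be re-parsed
        if (!@$f->appendXML($xml) || $f->firstChild === null) {
            return $node;
        }
        // Grab the rebuilt element before the fragment is emptied by replaceChild()
        $newNode = $f->firstChild;
        if ($node->parentNode !== null) {
            $node->parentNode->replaceChild($f, $node);
        }
        return $newNode;
    }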
    
    
    
    to be continued ...
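
    The scoring idea used in grabArticle() can be tried in isolation. The sketch below is only an illustration: scoreContainer() is a hypothetical helper that is not part of the posted script, and its keyword lists are shortened versions of the ones above. It works on plain strings instead of DOM nodes so it can be run without a page to parse.

```php
<?php
// Hypothetical helper (not in the posted script): scores one container
// from its class/id string and the text of the paragraphs inside it.
function scoreContainer(string $classOrId, array $paragraphs): int {
    // Shortened keyword lists, assumed for illustration only.
    $negative = '/comment|foot|header|menu|sidebar|sponsor|popup/';
    $positive = '/article|body|content|entry|main|post|text|story/';

    $score = 0;
    if (preg_match($negative, $classOrId)) {
        $score -= 50;   // looks like page furniture
    } elseif (preg_match($positive, $classOrId)) {
        $score += 25;   // looks like the article body
    }

    foreach ($paragraphs as $text) {
        if (strlen($text) > 10) {
            $score++;   // one point per paragraph longer than 10 bytes
        }
        $score += substr_count($text, ',');   // one point per comma
    }
    return $score;
}

$article = scoreContainer('entry-content', [
    'First paragraph, with two commas, and enough text.',
    'Second paragraph, shorter.',
]);
$sidebar = scoreContainer('sidebar', ['Home', 'About', 'Contact']);

echo "article: $article, sidebar: $sidebar\n";   // article: 30, sidebar: -50
```

    The container with the highest score wins, which is why a comma-rich block inside an "entry" or "content" element usually beats menus and sidebars.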
     
  2. stockptc

    stockptc Ads.id Fan

    Joined:
    Jan 21, 2011
    Messages:
    216
    Likes Received:
    112
    Location:
    aceh
    Here is the continuation, since it did not fit in the post above:

    Code:
    
    function clean($node, $tag, $minWords=1000000) {
        $targetList = $node->getElementsByTagName($tag);
        $_len = $targetList->length;
        
        for ($y=$_len-1; $y >=0; $y--) {
            $img = $targetList->item($y)->getElementsByTagName('img')->length;
            // If the text content isn't laden with words, remove the element
            // (unless it contains an image):
            if (substr_count($targetList->item($y)->textContent, ' ') < $minWords && $img == 0) {
                $targetList->item($y)->parentNode->removeChild($targetList->item($y));
            }
        }
        return $node;
    }
    ?>
    
    
    <?php
    define('BING_API_KEY', 'ABCDEFG'); // replace ABCDEFG with your Bing API key

    // Fetches $url with cURL, appending $params (if any) as a query string.
    function pete_curl_get($url, $params) {
        $post_params = array();
        foreach ($params as $key => $val) {
            if (is_array($val)) $val = implode(',', $val);
            $post_params[] = $key.'='.urlencode($val);
        }
        $fullurl = $url;
        if (!empty($post_params)) {
            // Use "&" when $url already carries a query string.
            $fullurl .= (strpos($url, '?') === false ? '?' : '&').implode('&', $post_params);
        }
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
        curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
        curl_setopt($ch, CURLOPT_URL, $fullurl);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        // You can use a different user agent; see the list at www.user-agents.org
        curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7) Gecko/20040608');
        $result = curl_exec($ch);
        curl_close($ch);
        return $result;
    }

    // Returns the Bing web results as arrays with url, title and abstract.
    function perform_bing_web_search($termstring) {
        $searchurl = 'http://api.bing.net/json.aspx?';
        $searchurl .= 'AppId='.BING_API_KEY;
        $searchurl .= '&Query='.urlencode($termstring);
        $searchurl .= '&Sources=Web';
        $searchurl .= '&Web.Count=5'; // number of sites to list
        $searchurl .= '&Web.Offset=0';
        $searchurl .= '&Web.Options=DisableHostCollapsing+DisableQueryAlterations';
        $searchurl .= '&JsonType=raw';
        $response = pete_curl_get($searchurl, array());
        $responseobject = json_decode($response, true);
        // A failed request and an empty result set both yield no results.
        if (empty($responseobject['SearchResponse']['Web']['Results'])) return array();
        $result = array();
        foreach ($responseobject['SearchResponse']['Web']['Results'] as $responseresult) {
            $result[] = array(
                'url' => $responseresult['Url'],
                'title' => $responseresult['Title'],
                'abstract' => $responseresult['Description'],
            );
        }
        return $result;
    }

    // $_REQUEST values arrive already URL-decoded.
    $termstring = isset($_REQUEST['s']) ? $_REQUEST['s'] : '';
    ?>
    
    <?php get_header(); ?>
    
            <div id="container">
                <div id="content" role="main">
    
    <?php if ( have_posts() ) : ?>
                    <h1 class="page-title"><?php printf( __( 'Search Results for: %s', 'twentyten' ), '<span>' . get_search_query() . '</span>' ); ?></h1>
                    <?php
                    /* Run the loop for the search to output the results.
                     * If you want to overload this in a child theme then include a file
                     * called loop-search.php and that will be used instead.
                     */
                     get_template_part( 'loop', 'search' );
                                    
                    ?>
    
    
    <?php else : ?>
    
                    <div id="post-0" class="post no-results not-found">
    
                        <div class="entry-content">
                            <p><?php _e( 'Sorry, but nothing matched your search criteria. Please try again with some different keywords.', 'twentyten' ); ?></p>
    
    
    
                          
    
    <?php function CleanFileNameBan($result){
    // Add keywords one by one to filter out unwanted words.
    $bannedkey = array("sex","porn","adult","gambling","seks");
    $result = str_ireplace($bannedkey, '', $result); // case-insensitive, so "SEKS" is covered too
    $result = trim($result);
    return $result;
    }
    ?>
     
    <?php $termstring = $s ?>
    
    <?php if ($s!='') {
    $bingresults = perform_bing_web_search($termstring);
    print '<p></p>';
    print '<p>Or you can find more information on another site (search by Bing):</p>';

    if (!empty($bingresults)) {
      // Fetch the full content of the last URL in the result list.
      $result = end($bingresults);
      $html = @file_get_contents($result['url']); // false if the page cannot be fetched
      if ($html !== false) {
        $isi = grabArticleHtml($html);
        // Assumption: the forum rendered the original entity as a plain space.
        // "&#13;" (what saveXML() emits for carriage returns) is the likely target;
        // stripping literal spaces would glue every word together.
        $ganti = array("&#13;");
        $isi = str_replace($ganti, "", $isi);

        $match=array("“","”","‘","’","…","—","–","Â","Sponsored links","href=");
        $replace=array("","","'","'","...","-","-","","","");
        $isi = str_replace($match, $replace, $isi);
        print '<h2 class="entry-title">'.htmlspecialchars(CleanFileNameBan($result['title'])).'</h2><br />';
        echo $isi;
        print '<p style="color:#777777">'.htmlspecialchars($result['url']).'</p>';
      }
    }

    }
    ?>
    
                                   </div><!-- .entry-content -->
                    </div><!-- #post-0 -->
    
    <?php endif; ?>
                </div><!-- #content -->
            </div><!-- #container -->
    
    <?php get_sidebar(); ?>
    <?php get_footer(); ?>
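
    The query string for the search request can also be assembled with PHP's built-in http_build_query() instead of concatenating and encoding each part by hand. This is only a sketch: appendQuery() is a hypothetical helper that is not part of the posted script, and ABCDEFG is the same placeholder key used above.

```php
<?php
// Hypothetical helper (not in the posted script): appends $params to $url,
// choosing "?" or "&" depending on whether $url already has a query string.
function appendQuery(string $url, array $params): string {
    if (empty($params)) {
        return $url;   // nothing to append, leave the URL untouched
    }
    $separator = (strpos($url, '?') === false) ? '?' : '&';
    // http_build_query() URL-encodes every value and joins the pairs with "&".
    return $url . $separator . http_build_query($params);
}

$url = appendQuery('http://api.bing.net/json.aspx', [
    'AppId'     => 'ABCDEFG',           // placeholder key, as in the post
    'Query'     => 'chch earthquake',
    'Sources'   => 'Web',
    'Web.Count' => 5,
]);

echo $url, "\n";
// http://api.bing.net/json.aspx?AppId=ABCDEFG&Query=chch+earthquake&Sources=Web&Web.Count=5
```

    Returning the URL unchanged when there are no parameters avoids a stray "?" at the end of a URL that already carries its own query string.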
    
     
    ibrahim_dhamar likes this.
  3. uchinx

    uchinx Hero

    Joined:
    Aug 7, 2010
    Messages:
    585
    Likes Received:
    9
    Location:
    Purwodadi,wirosari
    Mind if I follow along with this AGC "car mod"...
     
  4. zonabisnis

    zonabisnis Super Hero

    Joined:
    Mar 24, 2010
    Messages:
    1,853
    Likes Received:
    197
    Location:
    Tangerang
    Leaving a bookmark here for now; I'd like to learn how to tinker with AGC.
     
  5. janganan

    janganan Hero

    Joined:
    Feb 11, 2009
    Messages:
    575
    Likes Received:
    13
    Location:
    Depan Laptop
    Awesome... this is just what I've been looking for....
    Thanks button pressed....
     
  6. pembantu

    pembantu Ads.id Fan

    Joined:
    Feb 23, 2011
    Messages:
    106
    Likes Received:
    1
    Newbie here, I don't get it yet. How do I apply it? :pusing:
     
  7. januaranas

    januaranas Super Hero

    Joined:
    Aug 10, 2010
    Messages:
    2,071
    Likes Received:
    45
    Location:
    Arema » Malang
    Following along .......:D
     
  8. JhezeR

    JhezeR Super Hero

    Joined:
    Dec 14, 2009
    Messages:
    1,356
    Likes Received:
    59
    Location:
    Universe
    Let me try putting it into practice first, boss.
     
  9. shelfie

    shelfie Super Hero

    Joined:
    Aug 3, 2010
    Messages:
    1,911
    Likes Received:
    301
    Could you show an example of the script's output? [What does the full content look like?]
     
  10. stockptc

    stockptc Ads.id Fan

    Joined:
    Jan 21, 2011
    Messages:
    216
    Likes Received:
    112
    Location:
    aceh
    Here is a safe example :D :
    stockfile.net/search/complete-list-of-dead-and-missing-chch-earthquake
     
  11. rambut_pirang

    rambut_pirang Ads.id Pro

    Joined:
    Jun 23, 2010
    Messages:
    458
    Likes Received:
    8
    Location:
    Sumatra Utara, batu baru, kota indrapura
    Where does that code get pasted, boss? And is there an example site?
     
  12. net4idi

    net4idi Super Hero

    Joined:
    Feb 3, 2010
    Messages:
    799
    Likes Received:
    23
    Location:
    Medan, Indonesia
    Where do I put it, sis?
     
  13. stockptc

    stockptc Ads.id Fan

    Joined:
    Jan 21, 2011
    Messages:
    216
    Likes Received:
    112
    Location:
    aceh
    The example has already been given ...

    Same as the earlier ones: it goes in search.php
     
  14. janganan

    janganan Hero

    Joined:
    Feb 11, 2009
    Messages:
    575
    Likes Received:
    13
    Location:
    Depan Laptop
    Why am I getting an error, bro?
     
  15. stockptc

    stockptc Ads.id Fan

    Joined:
    Jan 21, 2011
    Messages:
    216
    Likes Received:
    112
    Location:
    aceh
    Well... if you want to install it on localhost, you have to adapt it first.
     
  16. masecho

    masecho Ads.id Pro

    Joined:
    May 6, 2010
    Messages:
    473
    Likes Received:
    10
    Following along too...............
     
  17. janganan

    janganan Hero

    Joined:
    Feb 11, 2009
    Messages:
    575
    Likes Received:
    13
    Location:
    Depan Laptop
    Ohh... so it has to be attached to WP... sorry
     
  18. az4net

    az4net Super Hero

    Joined:
    Jul 1, 2010
    Messages:
    1,340
    Likes Received:
    34
    Location:
    Di zona oksigen
    Cool.. is the whole script copy-pasted into search.php before <?php get_header(); ?>, or where? Thanks, master
     
  19. hayelah

    hayelah Super Hero

    Joined:
    May 9, 2009
    Messages:
    2,238
    Likes Received:
    113
    Location:
    Nunukan-Lampung-Indramayu
    I still don't get it, buddy, could you enlighten me???
    Is the script saved as its own file, or just placed in our theme's search.php, and whereabouts in it???
    Or is the script saved and uploaded to our hosting???
    thx ...
     
  20. geblek

    geblek Super Hero

    Joined:
    Nov 17, 2006
    Messages:
    1,640
    Likes Received:
    116
    So this AGC only works on the search page, right?
     