Regex, Aouka\Text PHP Code Examples

Code example #1

File: AbstractHandler.php Project: subtonix/aouka_lunch

 /**
  * Splits a string, separating HTML tags from text.
  * 
  * @param string $sInputString String to split.
  * @param boolean $bCaptureTags Whether HTML tags should be captured.
  * @param boolean $bSeparateHtmlPlain Whether tags and texts are returned in two separate sub-arrays.
  * @return array 
  */
 protected function _splitTagText($sInputString, $bCaptureTags = true, $bSeparateHtmlPlain = false)
 {
     // Replace with a placeholder every opening angle bracket that is not followed by:
     // !			=> <!DOCTYPE>, <!-- comments -->
     // a letter	=> <div>
     // /			=> </div>
     $sLeftAngleBracketReplace = uniqid('bracket');
     $sString = preg_replace("`<(?!!|[[:alpha:]]|/)`", $sLeftAngleBracketReplace, $sInputString);
     $oTagRegex = Regex::tag()->setModifiers($this->_m());
     // Split the string on HTML tags
     $aHtmlExplodeTexts = array_values($oTagRegex->split($sString, -1, $bCaptureTags ? PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE : PREG_SPLIT_NO_EMPTY));
     foreach ($aHtmlExplodeTexts as &$sHtmlExplodeText) {
         // If this is not an HTML tag
         if (mb_substr($sHtmlExplodeText, 0, 1, $this->_oString->getEncoding()) !== '<') {
             // restore any angle brackets that do not belong to a tag
             $sHtmlExplodeText = str_replace($sLeftAngleBracketReplace, '<', $sHtmlExplodeText);
         }
     }
     if ($bCaptureTags && $bSeparateHtmlPlain) {
         $aTags = $aTexts = array();
         foreach ($aHtmlExplodeTexts as $iIndex => $sTagText) {
             if ($oTagRegex->test($sTagText)) {
                 $aTags[$iIndex] = $sTagText;
             } else {
                 $aTexts[$iIndex] = $sTagText;
             }
         }
         return array('tags' => $aTags, 'texts' => $aTexts);
     }
     return $aHtmlExplodeTexts;
 }
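
The placeholder technique used above can be sketched with plain PCRE functions. This is a minimal illustration, not the library's implementation: the function name splitTagText and the constant TAG_PATTERN are hypothetical, and TAG_PATTERN is a deliberately simple stand-in for whatever Regex::tag() actually produces.

```php
<?php
// Hypothetical, simplified stand-in for the pattern behind Regex::tag().
const TAG_PATTERN = '`(<(?:!|/|[[:alpha:]])[^>]*>)`';

function splitTagText(string $input): array
{
    // Hide every "<" that cannot start a tag behind a unique placeholder.
    $placeholder = uniqid('bracket');
    $masked = preg_replace('`<(?!!|[[:alpha:]]|/)`', $placeholder, $input);

    // Split on tags, keeping the tags themselves in the result.
    $parts = preg_split(TAG_PATTERN, $masked, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);

    // Restore the hidden brackets, in text chunks only.
    foreach ($parts as $i => $part) {
        if ($part[0] !== '<') {
            $parts[$i] = str_replace($placeholder, '<', $part);
        }
    }
    return $parts;
}

print_r(splitTagText('<p>1 < 2</p>'));
```

Without the placeholder step, the stray "<" in "1 < 2" could be mistaken for the start of a tag by a more permissive tag pattern.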

Code example #2

File: Focus.php Project: subtonix/aouka_lunch

 /**
  * Targets the part of the string that starts at occurrence $mStart and ends at $mEnd.<br/>
  * Every method called afterwards applies only to the part of the text targeted by this method,
  * including \Aouka\Text\Manipulator::get().<br/>After targeting part of the text and transforming it,
  * you can return to the whole original text, while keeping the result of the changes,
  * with the \Aouka\Text\Manipulator::end() method.
  * 
  * <p>
  *	Usage examples:<br/>
  *	Targeting the part of the text located:
  *	<ul>
  *		<li>- up to the end of the 3rd sentence (included) ->focus(NULL, array(3, \Aouka\Text\InterfaceText::SENTENCE)).</li>
  *		<li>- before the start of the 3rd sentence (excluded) ->focus(NULL, array(3, \Aouka\Text\Regex::sentence(), FALSE)).</li>
  *		<li>- between the 15th word (excluded) and the last word "toto" (included) ->focus(array(15, \Aouka\Text\Regex::word(), FALSE), array(-1, "`(toto)`")).</li>
  *		<li>- after the 3rd letter "a" (excluded) ->focus(array(3, \Aouka\Text\Regex::create("`(a)`"), FALSE), NULL).</li>
  *	</ul>
  * </p>
  * 
  * @param array|null $mStart If $mStart is NULL, targeting starts at the beginning of the string.<br/>
  * If $mStart is an array, it must contain 2 or 3 elements: <br/>
  *	<ul>
  *		<li>
  *			- The first element is an integer giving the position n of the anchor at which
  *			targeting starts. If it is positive, targeting starts at the nth anchor counted from
  *			the beginning of the string; if it is negative, targeting starts at the nth anchor counted from the end.
  *			If it is 0, the anchor is ignored for the start of the targeting.
  *		</li>
  *		<li>
  *			- The second element determines the anchor to use. It is a \Aouka\Text\Regex object, a regular
  *			expression given as a string, or a constant of the \Aouka\Text\InterfaceText interface.
  *		</li>
  *		<li>
  *			- The third element is optional. It is a boolean that defines whether the anchor is included in or
  *			excluded from the targeting. By default, the anchor is included.
  *		</li>
  *	</ul>
  * @param array|null $mEnd If $mEnd is NULL, targeting ends at the end of the string.<br/>
  * If $mEnd is an array, it must contain 2 or 3 elements: <br/>
  *	<ul>
  *		<li>
  *			- The first element is an integer giving the position n of the anchor at which
  *			targeting ends. If it is positive, targeting ends at the nth anchor counted from
  *			the beginning of the string; if it is negative, targeting ends at the nth anchor counted from the end.
  *			If it is 0, the anchor is ignored for the end of the targeting.
  *		</li>
  *		<li>
  *			- The second element determines the anchor to use. It is a \Aouka\Text\Regex object, a regular
  *			expression given as a string, or a constant of the \Aouka\Text\InterfaceText interface.
  *		</li>
  *		<li>
  *			- The third element is optional. It is a boolean that defines whether the anchor is included in or
  *			excluded from the targeting. By default, the anchor is included.
  *		</li>
  *	</ul>
  */
 public function __construct($mStart, $mEnd = null)
 {
     $aArgs = func_get_args();
     $aArgs[1] = isset($aArgs[1]) ? $aArgs[1] : null;
     for ($i = 0; $i <= 1; $i++) {
         $iArgIndex = $i + 1;
         if (!is_array($aArgs[$i]) && !is_null($aArgs[$i])) {
             throw ExceptionType::invalidArgument("Argument #{$iArgIndex} doit être un tableau ou null.", Exception::FROM_HANDLER);
         }
         if ($aArgs[$i] !== null) {
             $aParams = array_values($aArgs[$i]);
             if (count($aParams) < 2 || !is_int($aParams[0])) {
                 throw ExceptionType::invalidArgument("Argument #{$iArgIndex} n'est pas un tableau indexé ayant comme premier élément un entier positif et second élément une expression rationnelle.\n\t\t\t\t\t\tLe troisième élément est optionnel. C'est un booléen déterminant si l'ancre de focus doit être incluse dans le ciblage.", Exception::FROM_HANDLER);
             }
             try {
                 $aArray = array('regex' => Regex::create($aParams[1]), 'offset' => $aParams[0], 'include' => (bool) (isset($aParams[2]) ? $aParams[2] : self::$_bDefaultInclusion));
             } catch (\Exception $oException) {
                 throw ExceptionType::invalidArgument("Argument #{$iArgIndex} n'a pas comme second élément une expression rationnelle, un objet Aouka\\Text\\Regex ou une constante de Aouka\\Text\\InterfaceText.", Exception::FROM_HANDLER);
             }
             if (!$i) {
                 $this->_aInterval['start'] = $aArray;
             } else {
                 $this->_aInterval['end'] = $aArray;
             }
         }
     }
 }
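
The validation rules described in the docblock can be summarised in a small standalone function. This is only a sketch: normalizeAnchor is a hypothetical name, it keeps the pattern as given instead of wrapping it in a Regex object, and the default inclusion value of true is taken from the docblock, not from the value of self::$_bDefaultInclusion, which is not shown here.

```php
<?php
// Hypothetical helper: turns [offset, pattern, include?] into a named array.
function normalizeAnchor(?array $anchor, bool $defaultInclude = true): ?array
{
    if ($anchor === null) {
        // No anchor: the focus extends to the start or end of the string.
        return null;
    }
    $params = array_values($anchor);
    if (count($params) < 2 || !is_int($params[0])) {
        throw new InvalidArgumentException(
            'Anchor must be [int offset, pattern, optional bool include].'
        );
    }
    return [
        'offset'  => $params[0],
        'pattern' => $params[1],
        'include' => (bool) ($params[2] ?? $defaultInclude),
    ];
}

var_dump(normalizeAnchor([3, '`(a)`', false]));
var_dump(normalizeAnchor([-1, '`(toto)`']));
```

Negative offsets pass the check, which matches the docblock (counting from the end) even though the original error message mentions a positive integer.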

Code example #3

File: Obfuscate.php Project: subtonix/aouka_lunch

 protected function _htmlHandling()
 {
     $aTagTexts = $this->_splitTagText($this->_oString->get(), true, false);
     $iTagScriptDepth = 0;
     $oRegexTag = \Aouka\Text\Regex::tag();
     foreach ($aTagTexts as &$sTagText) {
         if ($oRegexTag->test($sTagText)) {
             // If this is a script tag
             if (preg_match("`^<(/)?\\s*script`i", $sTagText, $aCaptures)) {
                 // If it is an opening script tag,
                 // increment the script depth
                 if (!isset($aCaptures[1])) {
                     ++$iTagScriptDepth;
                 } else {
                     $iTagScriptDepth -= 1;
                     $iTagScriptDepth = $iTagScriptDepth < 0 ? 0 : $iTagScriptDepth;
                 }
             }
         } else {
             if (!$iTagScriptDepth) {
                 $sTagText = $this->_process($sTagText);
             }
         }
     }
     return $this->_oString->setString(implode('', $aTagTexts));
 }
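
The script-depth bookkeeping above can be tried in isolation. The sketch below assumes the input has already been split into tags and text chunks (as _splitTagText() does) and uses a hypothetical function name, transformOutsideScripts, with an arbitrary callable in place of _process().

```php
<?php
// Applies $transform to text chunks, skipping anything inside <script>...</script>.
function transformOutsideScripts(array $parts, callable $transform): string
{
    $scriptDepth = 0;
    foreach ($parts as $i => $part) {
        if (preg_match('`^<(/)?\s*script`i', $part, $m)) {
            // Group 1 is only set for a closing tag; never go below zero.
            $scriptDepth = isset($m[1]) ? max(0, $scriptDepth - 1) : $scriptDepth + 1;
        } elseif ($part !== '' && $part[0] !== '<' && $scriptDepth === 0) {
            $parts[$i] = $transform($part);
        }
    }
    return implode('', $parts);
}

echo transformOutsideScripts(
    ['<p>', 'hello', '</p>', '<script>', 'var a = 1;', '</script>', 'bye'],
    'strtoupper'
);
```

Here a chunk counts as a tag simply because it starts with "<", which is a simplification of the Regex::tag() test used in the original.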

Code example #4

File: LowerCase.php Project: subtonix/aouka_lunch

 protected function _htmlHandling()
 {
     $aHtmlPlainText = $this->_splitTagText($this->_oString->get());
     $oRegexTag = Regex::tag()->setModifiers($this->_m());
     foreach ($aHtmlPlainText as &$sTagText) {
         if (!$oRegexTag->test($sTagText)) {
             $sTagText = mb_strtolower($sTagText, $this->_oString->getEncoding());
         }
     }
     return $this->_oString->setString(implode('', $aHtmlPlainText));
 }

Code example #5

File: LimitWords.php Project: subtonix/aouka_lunch

 protected function _getWordRegex()
 {
     $sAdditionalChars = '';
     if ($this->_sAdditionalChars) {
         $aAdditionalChars = preg_split('//' . $this->_m(), $this->_sAdditionalChars, -1, PREG_SPLIT_NO_EMPTY);
         $sAdditionalChars = implode('', array_map(function ($sAdditionalChar) {
             return '|' . preg_quote($sAdditionalChar, '`');
         }, $aAdditionalChars));
     }
     return Regex::create("`((?:[^[:space:][:punct:]]{$sAdditionalChars})+)`")->setModifiers($this->_m());
 }
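
To see what kind of pattern this method builds, here is a standalone sketch that returns the pattern as a string instead of a Regex object. The function name buildWordPattern is hypothetical, and the "u" modifier is hard-coded where the original calls $this->_m().

```php
<?php
// Builds a word pattern in which the given extra characters count as word characters.
function buildWordPattern(string $additionalChars = ''): string
{
    $alternatives = '';
    // Split into individual UTF-8 characters and escape each one for use
    // inside a pattern delimited by backticks.
    foreach (preg_split('//u', $additionalChars, -1, PREG_SPLIT_NO_EMPTY) as $char) {
        $alternatives .= '|' . preg_quote($char, '`');
    }
    return "`((?:[^[:space:][:punct:]]{$alternatives})+)`u";
}

preg_match_all(buildWordPattern("-'"), "l'état-major est prêt", $matches);
print_r($matches[1]);
```

With the hyphen and apostrophe added, "l'état-major" is matched as a single word rather than three.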

Code example #6

File: Slug.php Project: subtonix/aouka_lunch

 protected function _htmlHandling()
 {
     $aTagTexts = $this->_splitTagText($this->_oString->get());
     $oRegexTag = Regex::tag()->setModifiers($this->_m());
     foreach ($aTagTexts as &$sTagText) {
         if (!$oRegexTag->test($sTagText)) {
             $sTagText = $this->_process($sTagText);
         }
     }
     return $this->_oString->setString(implode('', $aTagTexts));
 }

Code example #7

File: GetWordsCount.php Project: subtonix/aouka_lunch

 protected function _chunkHtmlHandling()
 {
     $aTagTexts = $this->_splitTagText($this->_oString->get());
     $oRegexTag = Regex::tag();
     $iCount = 0;
     foreach ($aTagTexts as &$sTagText) {
         if (!$oRegexTag->test($sTagText)) {
             $iCount += $this->_process($sTagText);
         }
     }
     return $iCount;
 }

Code example #8

File: Ucfirst.php Project: subtonix/aouka_lunch

 protected function _htmlHandling()
 {
     $aHtmlPlainText = $this->_splitTagText($this->_oString->get());
     $oTagRegex = Regex::tag()->setModifiers($this->_m());
     $sEncoding = $this->_oString->getEncoding();
     foreach ($aHtmlPlainText as &$sValue) {
         if (!$oTagRegex->test($sValue)) {
             $sValue = self::_ucfirst($sValue, $sEncoding);
             break;
         }
     }
     return $this->_oString->setString(implode('', $aHtmlPlainText));
 }

Code example #9

File: WrapTag.php Project: subtonix/aouka_lunch

 /**
  * Sets the opening and closing tags.
  * 
  * The closing tags are derived from the opening tags,
  * which are read from $this->_sInputTags.
  * 
  * @throws \Exception
  */
 protected function _initTags()
 {
     $oRegexTag = Regex::tag()->setModifiers($this->_m());
     $aTags = array_filter(preg_split("`({$oRegexTag})`", $this->_sInputTags, -1, PREG_SPLIT_DELIM_CAPTURE));
     foreach ($aTags as $sTag) {
         if (!$oRegexTag->test(trim($sTag))) {
             throw ExceptionType::invalidArgument("Argument #1 doit être uniquement composé de balises HTML.", Exception::FROM_HANDLER);
         }
         $this->_sStartingTags = $this->_sInputTags;
         if (preg_match('`^<([[:alnum:]]+)`', $sTag, $aMatches)) {
             $this->_sEndingTags = "</{$aMatches[1]}>" . $this->_sEndingTags;
         }
     }
 }
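
Deriving the closing tags from the opening ones comes down to collecting tag names and emitting them in reverse order. A minimal sketch, with a hypothetical function name closingTagsFor and without the validation step performed above:

```php
<?php
// Returns the closing tags matching a run of opening tags, innermost first.
function closingTagsFor(string $openingTags): string
{
    preg_match_all('`<([[:alnum:]]+)`', $openingTags, $matches);
    $closing = '';
    foreach ($matches[1] as $name) {
        // Prepend, so the last tag opened is the first one closed.
        $closing = "</{$name}>" . $closing;
    }
    return $closing;
}

echo closingTagsFor('<div class="box"><p><strong>');
```

Unlike the original, this version scans the whole string at once, so it assumes attribute values do not themselves contain a "<" followed by a letter or digit.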

Code example #10

File: Split.php Project: subtonix/aouka_lunch

 protected function _process($sString, $iSplitLength)
 {
     if ($iSplitLength < 1) {
         return false;
     }
     Regex::char()->setModifiers($this->_m())->matchAll($sString, $aCaptures);
     if (!$aCaptures[0]) {
         return $aCaptures[0];
     }
     $aChunks = array_chunk($aCaptures[0], $iSplitLength);
     foreach ($aChunks as $iIndex => $aChunk) {
         $aChunks[$iIndex] = implode('', $aChunk);
     }
     return $aChunks;
 }
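
The same chunking can be reproduced without the library, assuming Regex::char() matches one character at a time. In this sketch the function name splitChars is hypothetical and an empty pattern with the "u" modifier does the character splitting:

```php
<?php
// Splits a UTF-8 string into chunks of $length characters.
function splitChars(string $string, int $length)
{
    if ($length < 1) {
        return false; // same guard as the method above
    }
    // An empty pattern with the "u" modifier splits between UTF-8 characters.
    $chars = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
    return array_map(
        function (array $chunk) {
            return implode('', $chunk);
        },
        array_chunk($chars, $length)
    );
}

print_r(splitChars('héllo wörld', 4));
```

Splitting by character rather than by byte keeps multibyte letters such as "é" and "ö" intact across chunk boundaries.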

Code example #11

File: Each.php Project: subtonix/aouka_lunch

 /**
  * Formats the regular expression to use as a literal pattern.
  * 
  * @return void
  * @throws \Exception
  */
 protected function _formatPattern()
 {
     // If we receive a string, treat it as a valid pattern
     if (is_string($this->_mPattern)) {
         return;
     }
     // If the pattern is an integer, it is a constant,
     // so build its regex
     if (is_int($this->_mPattern)) {
         $this->_mPattern = Regex::create($this->_mPattern);
     }
     // If the pattern is a \Aouka\Text\Regex object, build the expression in its literal form
     if ($this->_mPattern instanceof Regex) {
         $this->_mPattern = '`(' . $this->_mPattern->getExpression() . ')`' . $this->_m();
         return;
     }
     throw ExceptionType::invalidArgument("Le pattern fourni n'est pas correct.", Exception::FROM_HANDLER);
 }

Code example #12

File: AutoLink.php Project: subtonix/aouka_lunch

 public function __construct($cLinkType, $aAttributes = array(), $aUrl = array())
 {
     $this->_iLinkType = $cLinkType;
     foreach ($aAttributes as $sName => $sValue) {
         $this->_sAttributes .= ' ' . $sName . '="' . $sValue . '"';
     }
     if (isset($aUrl[self::QUERY]) && isset($aUrl[self::QUERY][self::QUERY_DATA])) {
         $this->_aUrl[self::QUERY] = http_build_query($aUrl[self::QUERY][self::QUERY_DATA], isset($aUrl[self::QUERY][self::QUERY_NUMERIC_PREFIX]) ? $aUrl[self::QUERY][self::QUERY_NUMERIC_PREFIX] : '', isset($aUrl[self::QUERY][self::QUERY_ARG_SEPARATOR]) ? $aUrl[self::QUERY][self::QUERY_ARG_SEPARATOR] : ini_get('arg_separator.output'), isset($aUrl[self::QUERY][self::QUERY_ENC_TYPE]) ? $aUrl[self::QUERY][self::QUERY_ENC_TYPE] : PHP_QUERY_RFC1738);
     }
     if ($this->_iLinkType === self::LINK_EMAIL) {
         $this->_oRegex = Regex::email();
     } else {
         $this->_oRegex = Regex::url();
         if (isset($aUrl[self::FRAGMENT])) {
             $this->_aUrl[self::FRAGMENT] = $aUrl[self::FRAGMENT];
         }
     }
 }

Code example #13

File: WrapMirror.php Project: subtonix/aouka_lunch

 /**
  * Builds the appended text (appendice) from the prepended text (prependice)
  */
 protected function _setAppendice()
 {
     $sUnicodeModifier = $this->_m();
     $oSpaceRegex = Regex::space()->setModifiers($sUnicodeModifier);
     $aSpaceWords = preg_split("`({$oSpaceRegex})`{$sUnicodeModifier}", $this->_sPrependice, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
     $aWords = $aSpaces = array();
     foreach ($aSpaceWords as $iKey => $sSpaceWord) {
         if ($oSpaceRegex->test($sSpaceWord)) {
             $aSpaces[$iKey] = $sSpaceWord;
         } else {
             $aWords[$iKey] = $sSpaceWord;
         }
     }
     foreach ($aWords as &$sWord) {
         $aStringWordChars = preg_split("``{$sUnicodeModifier}", $sWord);
         foreach ($aStringWordChars as &$aStringWordChar) {
             $aStringWordChar = Mirror::char($aStringWordChar);
         }
         $sWord = implode('', array_reverse($aStringWordChars));
     }
     $aMergedSpaceWords = self::_coordinatingMerge($aWords, $aSpaces);
     $this->_sAppendice = implode('', array_reverse($aMergedSpaceWords));
 }

Code example #14

File: GetSentences.php Project: subtonix/aouka_lunch

 protected function _globalHtmlHandling()
 {
     $aTexts = $this->_splitTagText($this->_oString->get(), false);
     Regex::sentence()->setModifiers($this->_m())->matchAll(implode('', $aTexts), $aMatches);
     return $aMatches[0];
 }

Code example #15

File: Position.php Project: subtonix/aouka_lunch

 /**
  * Returns an array of the positions of the HTML and plain-text elements in the string $this->_oString.
  * 
  * For example, for $this->_oString->get() = "[div]Toto", the returned array looks like this:
  * [
  *		[0] => [
  *			[element]	=> [div],
  *			[length]	=> 5,
  *			[pos]		=> [
  *				[string]	=>	[0, 5] // Start and end positions in the string.
  *			]
  *		],
  *		[1] => [
  *			[element]	=> Toto,
  *			[length]	=> 4,
  *			[pos]		=> [
  *				[string]	=>	[5, 9], // Start and end positions in the string.
  *				[text]		=>	[0, 4]  // Start and end positions in the plain text.
  *			]
  *		]
  * ]
  * 
  * @return array Array of positions
  */
 protected function _getTagTextPositions()
 {
     if ($this->_oString->getType() === InterfaceText::PLAIN) {
         return $this->_getPlainTagTextPositions();
     }
     // Replace with a placeholder every opening angle bracket that is not followed by:
     // !			=> <!DOCTYPE>, <!-- comments -->
     // a letter	=> <div>
     // /			=> </div>
     $sLeftAngleBracketReplace = uniqid('bracket');
     $sString = preg_replace("`<(?!!|[[:alpha:]]|/)`", $sLeftAngleBracketReplace, $this->_oString->get());
     // Split the string on HTML tags
     $oTagRegex = Regex::tag()->setModifiers($this->_m());
     $aHtmlExplodeTexts = array_values($oTagRegex->split($sString, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_OFFSET_CAPTURE));
     // Collected information
     $aInfos = array();
     // Statistics
     $aStats = array('tag_indexes' => array(), 'no_tag_indexes' => array());
     // End position of the previous text (i.e. of the last element that is not a tag)
     $iLastTextPos = 0;
     $aTagText = array();
     foreach ($aHtmlExplodeTexts as $iIndex => $aHtmlExplodeText) {
         $sElement = $aHtmlExplodeText[0];
         // Is it an HTML tag?
         $bTag = mb_substr($sElement, 0, 1, $this->_oString->getEncoding()) == '<';
         // If this is not an HTML tag
         if (!$bTag) {
             // restore any angle brackets that do not belong to a tag
             $sElement = str_replace($sLeftAngleBracketReplace, '<', $sElement);
         }
         $aTagText[$iIndex] = $sElement;
         // Number of characters in the element
         $iLength = mb_strlen($sElement, $this->_oString->getEncoding());
         // Store the element's start and end positions in the string
         $aPos = array('string' => array($aHtmlExplodeText[1], $aHtmlExplodeText[1] + $iLength));
         // If the element is not a tag, also store its start and end positions
         // in the text displayed on an HTML page
         if (!$bTag) {
             $aPos['text'] = array($iLastTextPos, $iLastTextPos + $iLength);
             $iLastTextPos += $iLength;
         }
         // Add the information to the statistics arrays
         $aInfos[$iIndex] = array('element' => $sElement, 'type' => $bTag ? 'tag' : 'text', 'length' => $iLength, 'pos' => $aPos);
         $aStats[($bTag ? '' : 'no_') . 'tag_indexes'][] = $iIndex;
     }
     $iTotalStringLength = 0;
     if ($aInfos) {
         $aLastInfos = end($aInfos);
         $iTotalStringLength = $aLastInfos['pos']['string'][1];
     }
     return array('tag_text' => $aTagText, 'tag_text_pos' => $aInfos, 'stats' => array('string' => array('length' => $iTotalStringLength), 'text' => array('length' => mb_strlen(Regex::tag()->replace($this->_oString->get(), ''), $this->_oString->getEncoding())), 'tag_text_pos' => $aStats));
 }

Code example #16

File: StripHTMLTags.php Project: subtonix/aouka_lunch

 protected function _setAllowableTagNames()
 {
     Regex::create('`<(\\w+)(?:.+?)>`')->setModifiers($this->_m())->matchAll($this->_sAllowableTags, $aTags);
     $this->_aAllowableTagNames = $aTags[1];
 }

Code example #17

File: GetExplode.php Project: subtonix/aouka_lunch

 protected function _htmlHandling()
 {
     $sString = Regex::tag()->replace($this->_oString->get(), '');
     return $this->_process($sString);
 }

Code example #18

File: PositionAnalyzer.php Project: subtonix/aouka_lunch

 protected function _buildRegexPos()
 {
     // Positions of the elements split by the regular expression
     $this->_aRegexPos = array();
     if ($this->_getHTMLParsingMode() !== String::CHUNK_PARSING) {
         // Text stripped of its tags
         if ($this->_getStringType() === InterfaceText::HTML) {
             $oRegexTag = Regex::tag()->setModifiers($this->_m());
             $sPlainText = $oRegexTag->replace($this->_getString()->get(), '');
         } else {
             $sPlainText = $this->_getString()->get();
         }
         // Split the plain text with the regular expression
         $aPlainTexts = array_values($this->_getRegex()->split($sPlainText, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_OFFSET_CAPTURE));
         // Feed the array of regex positions in the text
         $this->_feedRegexPosArray($aPlainTexts);
     } else {
         // Get the positions of the separated elements (HTML tags and plain texts)
         $aTagTextPositions = $this->getTagTextPos();
         // Length of the preceding text chunks
         $iPrevTextsLength = 0;
         // For each tag/text
         foreach ($aTagTextPositions as $aTagTextPos) {
             // If the element is a text
             if ($aTagTextPos['type'] == 'text') {
                 // Split it with the regex, capturing positions
                 $aPlainTexts = $this->_oRegex->split($aTagTextPos['element'], -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY | PREG_SPLIT_OFFSET_CAPTURE);
                 // Feed the array of regex positions in the text
                 $iCurrentTextLength = $this->_feedRegexPosArray($aPlainTexts, $iPrevTextsLength);
                 $iPrevTextsLength += $iCurrentTextLength;
             }
         }
     }
     $iMatchingRegexIndex = $iNoMatchingRegexIndex = 0;
     foreach ($this->_aRegexPos as $iRegexPosIndex => $aRegexPos) {
         foreach ($aRegexPos['pos']['string'] as $aPos) {
             if ($aRegexPos['match_regex']) {
                 $this->_aStats['index']['matching_regex_elem'][$iMatchingRegexIndex] += array('regex_pos' => $iRegexPosIndex);
                 ++$iMatchingRegexIndex;
             } else {
                 $this->_aStats['index']['no_matching_regex_elem'][$iNoMatchingRegexIndex] += array('regex_pos' => $iRegexPosIndex);
                 ++$iNoMatchingRegexIndex;
             }
         }
     }
     // When not in chunk mode,
     // if the previous regex_pos equals the current regex_pos,
     // the two entries are merged, so that:
     /*
     	[0] => Array
     		(
     			[tag_text_pos] => 0
     			[chunk_pos] => 
     			[regex_pos] => 0
     		)
     
     	[1] => Array
     		(
     			[tag_text_pos] => 2
     			[chunk_pos] => 
     			[regex_pos] => 0
     		)
     */
     // becomes
     /*
     		[0] => Array
     			(
     				[tag_text_pos] => array(0, 2)
     				[chunk_pos] => 
     				[regex_pos] => 0
     			)
     */
     if ($this->_getHTMLParsingMode() !== String::CHUNK_PARSING) {
         $this->_groupStatRegexByRegexPos();
     }
 }

Code example #19

File: SetCase.php Project: subtonix/aouka_lunch

 protected function _setCase($sInputString, $cCase)
 {
     $sModifier = $this->_m();
     switch ($cCase) {
         case self::INITIAL_CASE:
             $sOutputString = $sInputString;
             break;
         case self::UPPER_CASE:
             $sOutputString = mb_strtoupper($sInputString, $this->_oString->getEncoding());
             break;
         case self::LOWER_CASE:
             $sOutputString = mb_strtolower($sInputString, $this->_oString->getEncoding());
             break;
         case self::SNAKE_CASE:
             $sLoweredString = mb_strtolower($sInputString, $this->_oString->getEncoding());
             $sUnderscoredLoweredString = preg_replace('`([[:punct:][:space:]]+)`' . $sModifier, '_', $sLoweredString);
             $sOutputString = preg_replace('`^_|_$`', '', $sUnderscoredLoweredString);
             break;
         case self::CAPITALIZE_CASE:
             $sEncoding = $this->_oString->getEncoding();
             $sLoweredString = mb_strtolower($sInputString, $this->_oString->getEncoding());
             $sOutputString = preg_replace_callback('`(' . Regex::word() . ')`' . $sModifier, function ($aWord) use ($sEncoding) {
                 $sFirstLetter = mb_strtoupper(Encoding::substr($aWord[1], $sEncoding, 0, 1), $sEncoding);
                 $sEndingWord = Encoding::substr($aWord[1], $sEncoding, 1);
                 return $sFirstLetter . $sEndingWord;
             }, $sLoweredString);
             break;
         case self::TITLE_CASE:
             // Capitalize the first letter of each word
             $sCapitalizedString = $this->_setCase($sInputString, self::CAPITALIZE_CASE);
             // Lowercase the first letter of short linking words
             $sEncoding = $this->_oString->getEncoding();
             $oLocale = Factory::create('Base', $this->_oString->getLocale());
             $sPatternLowerWords = '`(?<![[:alnum:]])(' . implode('|', $oLocale->getTitleCaseSmallWords()) . ')(?![[:alnum:]])`i';
             $sLoweredCapitalizedString = preg_replace_callback($sPatternLowerWords . $sModifier, function ($aWord) use ($sEncoding) {
                 return mb_strtolower($aWord[0], $sEncoding);
             }, $sCapitalizedString);
             // Capitalize the first letter of words that follow these punctuation characters: .-:!'?
             $aLoweredCapitalizedStrings = preg_split("`([.\\-:!'?])`", $sLoweredCapitalizedString, -1, PREG_SPLIT_DELIM_CAPTURE);
             foreach ($aLoweredCapitalizedStrings as &$sString) {
                 $sString = preg_replace_callback('`([a-zA-Z])(?:.*?)$`' . $sModifier, function ($aWord) use ($sEncoding) {
                     $sFirstLetter = mb_strtoupper(Encoding::substr($aWord[0], $sEncoding, 0, 1), $sEncoding);
                     $sEndingWord = Encoding::substr($aWord[0], $sEncoding, 1);
                     return $sFirstLetter . $sEndingWord;
                 }, $sString);
             }
             $sOutputString = implode('', $aLoweredCapitalizedStrings);
             break;
         case self::SENTENCE_CASE:
             $sEncoding = $this->_oString->getEncoding();
             $sOutputString = preg_replace_callback('`(' . Regex::sentence() . ')`' . $sModifier, function ($aWord) use ($sEncoding) {
                 $sFirstLetter = mb_strtoupper(Encoding::substr($aWord[1], $sEncoding, 0, 1), $sEncoding);
                 $sEndingWord = Encoding::substr($aWord[1], $sEncoding, 1);
                 return $sFirstLetter . $sEndingWord;
             }, $sInputString);
             break;
         case self::STUDLY_CAPS:
             $sEncoding = $this->_oString->getEncoding();
             $sOutputString = preg_replace_callback('`([a-zA-Z])`' . $sModifier, function ($aChar) use ($sEncoding) {
                 if (rand(0, 1)) {
                     return mb_strtolower($aChar[1], $sEncoding);
                 } else {
                     return mb_strtoupper($aChar[1], $sEncoding);
                 }
             }, $sInputString);
             break;
         case self::CAMEL_CASE:
             $sEncoding = $this->_oString->getEncoding();
             $sLoweredString = mb_strtolower($sInputString, $this->_oString->getEncoding());
             $sUcwordString = preg_replace_callback('`(' . Regex::word() . ')`' . $sModifier, function ($aWord) use ($sEncoding) {
                 $sFirstLetter = mb_strtoupper(Encoding::substr($aWord[1], $sEncoding, 0, 1), $sEncoding);
                 $sEndingWord = Encoding::substr($aWord[1], $sEncoding, 1);
                 return $sFirstLetter . $sEndingWord;
             }, $sLoweredString);
             $sOutputString = preg_replace('`([[:punct:][:space:]]+)`' . $sModifier, '', $sUcwordString);
             break;
         case self::DROMEDARY_CASE:
             // Convert to CamelCase
             $sCapitalizedString = $this->_setCase($sInputString, self::CAMEL_CASE);
             $sEncoding = $this->_oString->getEncoding();
             $sFirstLetter = mb_strtolower(Encoding::substr($sCapitalizedString, $sEncoding, 0, 1), $sEncoding);
             $sEndingWord = Encoding::substr($sCapitalizedString, $sEncoding, 1);
             $sOutputString = $sFirstLetter . $sEndingWord;
     }
     return $sOutputString;
 }
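
As an illustration of one branch of the switch, the snake-case conversion can be written as a standalone function. This is a sketch under assumptions: toSnakeCase is a hypothetical name, the encoding defaults to UTF-8, the "u" modifier is hard-coded where the original uses $this->_m(), and the mbstring extension is assumed to be available.

```php
<?php
// Lowercases the input, turns runs of punctuation/whitespace into "_",
// then strips a leading or trailing underscore.
function toSnakeCase(string $input, string $encoding = 'UTF-8'): string
{
    $lowered = mb_strtolower($input, $encoding);
    $underscored = preg_replace('`[[:punct:][:space:]]+`u', '_', $lowered);
    return preg_replace('`^_|_$`', '', $underscored);
}

echo toSnakeCase('Hello, World!'), "\n";
echo toSnakeCase('Été Indien'), "\n";
```

Because whole runs of separators collapse into a single underscore, "Hello, World!" yields one underscore between the words rather than two.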

Code example #20

File: LimitChars.php Project: subtonix/aouka_lunch

 protected function _htmlHandling()
 {
     $sEncoding = $this->_oString->getEncoding();
     $aHtmlPlainText = $this->_splitTagText($this->_oString->get());
     // By default the string is neither cut nor exactly at the maximum length
     $bCutted = $bFitted = false;
     // Total number of characters parsed so far
     $iTotalChars = 0;
     // Index of the cell holding the chunk that reaches the limit
     $iCellIndex = null;
     $oTagRegex = Regex::tag()->setModifiers($this->_m());
     foreach ($aHtmlPlainText as $iIndex => &$sText) {
         // For plain-text elements
         if (!$oTagRegex->test($sText)) {
             // If the limit has not yet been reached or exceeded
             if (!$bFitted && !$bCutted) {
                 $iNbChars = mb_strlen($sText, $sEncoding);
                 $iTotalChars += $iNbChars;
                 if ($this->_iMaxLength <= $iTotalChars) {
                     // If the limit has just been exceeded
                     if ($this->_iMaxLength < $iTotalChars) {
                         $iTotalDelta = $iTotalChars - $this->_iMaxLength;
                         $sText = $this->_cut($sText, $sEncoding, $iNbChars - $iTotalDelta);
                         $bCutted = true;
                     } else {
                         $bFitted = true;
                     }
                     // Record the array cell at which the limit is applied
                     $iCellIndex = $iIndex;
                 }
             } else {
                 $sText = '';
                 // If the limit was hit exactly and text is now being removed
                 if ($bFitted) {
                     $bCutted = true;
                 }
             }
         }
     }
     // If the plain text was cut and a trim marker is set
     if ($bCutted && $this->_sTrimMarker) {
         // If the trim marker does not count towards the limit
         if (!$this->_bIncludeTrimMarker) {
             // Append it to the cell at which the limit was applied,
             // stripping trailing whitespace from that cell's text
             if ($iCellIndex !== null) {
                 if ($this->_bExactly) {
                     $aHtmlPlainText[$iCellIndex] = $aHtmlPlainText[$iCellIndex] . $this->_sTrimMarker;
                 } else {
                     $aHtmlPlainText[$iCellIndex] = rtrim($aHtmlPlainText[$iCellIndex]) . $this->_sTrimMarker;
                 }
             }
         } else {
             $iTrimMarkerLength = mb_strlen($this->_sTrimMarker, $sEncoding);
             $bCutted = $bFitted = false;
             $iTotalChars = 0;
             $iCellIndex = null;
             // Compute the target string length, taking the trim marker's length into account
             $iMaxStringLength = $this->_iMaxLength - $iTrimMarkerLength < 0 ? 0 : $this->_iMaxLength - $iTrimMarkerLength;
             foreach ($aHtmlPlainText as $iIndex => &$sText) {
                 // For plain-text elements
                 if (!$oTagRegex->test($sText)) {
                     // If the limit has not yet been reached or exceeded
                     if (!$bFitted && !$bCutted) {
                         $iNbChars = mb_strlen($sText, $sEncoding);
                         $iTotalChars += $iNbChars;
                         if ($iMaxStringLength <= $iTotalChars) {
                             // If the limit has just been exceeded
                             if ($iMaxStringLength < $iTotalChars) {
                                 $iTotalDelta = $iTotalChars - $iMaxStringLength;
                                 $sText = $this->_cut($sText, $sEncoding, $iNbChars - $iTotalDelta);
                                 $bCutted = true;
                             } else {
                                 $bFitted = true;
                             }
                             if (!$this->_bExactly) {
                                 // Append the trim marker after stripping trailing whitespace from the text
                                 $sText = rtrim($sText) . $this->_sTrimMarker;
                             }
                         }
                     } else {
                         $sText = '';
                     }
                 }
             }
         }
     }
     return $this->_oString->setString(implode('', $aHtmlPlainText));
 }
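
The core idea, counting only visible text while leaving tags untouched, can be shown in a much smaller form. The sketch below is not the method above: limitTextChars is a hypothetical name, it covers only the case where the trim marker does not count towards the limit, it treats any chunk starting with "<" as a tag, and it assumes UTF-8 input with the mbstring extension available.

```php
<?php
// Truncates the visible text of pre-split HTML parts to $max characters.
function limitTextChars(array $parts, int $max, string $marker = ''): string
{
    $total = 0;            // visible characters kept so far
    $cut = false;          // whether any text was removed
    $lastTextIndex = null; // last text cell that still holds kept text
    foreach ($parts as $i => $part) {
        if ($part !== '' && $part[0] === '<') {
            continue; // tags are always kept
        }
        if ($total >= $max) {
            // Limit already reached: drop the remaining text.
            if ($part !== '') {
                $cut = true;
            }
            $parts[$i] = '';
            continue;
        }
        $remaining = $max - $total;
        $length = mb_strlen($part, 'UTF-8');
        if ($length > $remaining) {
            $parts[$i] = mb_substr($part, 0, $remaining, 'UTF-8');
            $cut = true;
            $total = $max;
        } else {
            $total += $length;
        }
        $lastTextIndex = $i;
    }
    if ($cut && $marker !== '' && $lastTextIndex !== null) {
        $parts[$lastTextIndex] = rtrim($parts[$lastTextIndex]) . $marker;
    }
    return implode('', $parts);
}

echo limitTextChars(['<p>', 'Hello ', '<b>', 'world', '</b>', ' again', '</p>'], 8, '...');
```

Keeping every tag, including those after the cut point, is what lets the truncated output stay balanced.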

Code example #21

File: GetStats.php Project: subtonix/aouka_lunch

 protected function _plainHandling()
 {
     $oRegex = Regex::word()->setModifiers($this->_m());
     return $this->_limitByRegex(3, $oRegex);
 }
